(54) Network echo canceller

(57) An echo canceller and method for cancelling in 
a return channel signal an echoed receive channel signal,
where the echoed receive channel signal is combined
by an echo channel with an input return channel
signal. The echo canceller has a first filter which gener- 
ates first filter coefficients, generates a first echo esti- 
mate signal with the first filter coefficients, and updates 
the first filter coefficients in response to a first filter con- 
trol signal. A first summer subtracts the first echo esti- 
mate signal from a combined return channel and echo 
receive channel signal to generate a first echo residual 
signal. A second filter generates second filter coeffi- 
cients, generates a second echo estimate signal with 
the second filter coefficients, and updates the second 
filter coefficients in response to a second filter control 
signal. A second summer subtracts the second echo estimate
signal from the combined signal to generate a
second echo residual signal, and provides upon the re- 
turn channel the second echo residual signal. A control 
unit determines from the receive channel signal, the 
combined signal, and the first and second echo residual 
signals, one of a plurality of control states wherein a first 
control state is indicative of a receive channel signal
above a first predetermined energy level, wherein when 
the control unit is in the first control state it generates 
the first control signal and generates the second control 
signal when at least one of a first energy ratio of the first 
echo residual signal and the combined signal and a sec- 
ond energy ratio of the second echo residual signal and 
the combined signal exceed a predetermined level. 
Description

BACKGROUND OF THE INVENTION 
I. Field of the Invention

[0001] The present invention relates to communication systems. More particularly, the present invention relates to
a novel and improved method and apparatus for cancelling echos in telephone systems.

II. Description of the Related Art

[0002] Every current land-based telephone is connected to a central office by a two-wire line (called the customer
or subscriber loop) which supports transmission in both directions. However, for calls longer than about 35 miles, the 
two directions of transmission must be segregated onto physically separate wires, resulting in a four-wire line. The 
device that interfaces the two-wire and four-wire segments is called a hybrid. A typical long-distance telephone circuit
can be described as being two-wire in the subscriber loop to the local hybrid, four-wire over the long-haul network to 
the distant hybrid, and then two-wire to the distant speaker. 

[0003] Although the use of hybrids facilitates long distance speech transmission, impedance mismatches at the 
hybrid may result in echos. The speech of the speaker A is reflected off the distant hybrid (the hybrid closest to the 
speaker B) in the telephone network back toward the speaker A, causing the speaker A to hear an annoying echo of
his/her own voice. Network echo cancellers are thus used in the land-based telephone network to eliminate echos 
caused by impedance mismatches at the hybrids and are typically located in the central office along with the hybrid. 
The echo canceller located closest to speaker A or B is thus used to cancel the echo caused by the hybrid at the other 
end of the call. 

[0004] Network echo cancellers, employed in the land-based telephone system, are typically digital devices so as
to facilitate digital transmission of the signals. Since the analog speech signals need to be converted to digital form, a 
codec located at the central office is typically employed. The analog signals provided from telephone A (speaker A) to 
central office A are passed through hybrid A and are converted to digital form by codec A. The digital signals are then 
transmitted to central office B where they are provided to codec B for conversion to analog form. The analog signals
are then coupled through hybrid B to the telephone B (speaker B). At the hybrid B, an echo of the speaker A's signal
is created. This echo is encoded by the codec B and transmitted back to the central office A. At central office A an 
echo canceller removes the return echo. 

[0005] In the conventional analog cellular telephone system, echo cancellers are also employed and are typically 
located at the base station. These echo cancellers operate in a similar fashion to those in the land-based system to
remove unwanted echo.

[0006] In a digital cellular telephone system for a call between a mobile station and a land-based telephone, the 
mobile station speaker's speech is digitized using a codec and then compressed using a vocoder, which models the 
speech into a set of parameters. The vocoded speech is coded and transmitted digitally over the airwaves. The base 
station receiver decodes the signal and passes it four-wire to the vocoder decoder, which synthesizes a digital speech
signal from the transmitted speech parameters. This synthesized speech is passed to the telephone network over a
T1 interface, a time-multiplexed group of 24 voice channels. At some point in the network, usually at the central office, 
the signal is converted back to analog form and passed to the hybrid at the subscriber loop. At this hybrid the signal 
is converted to two-wire for transmission over the wire-pair toward the land-based subscriber phone. 
[0007] For reference purposes, in a cellular call between a mobile station and a land-based telephone, the speaker
in the mobile station is the far-end talker and the speaker at the land-based telephone is the near-end talker. As in the
land-based system, the speech of the far-end talker is reflected off the distant hybrid in the telephone network back
towards the far-end talker. As a result the far-end talker, i.e. the mobile station user, hears an annoying echo of his/her own voice.
[0008] Conventional network echo cancellers typically employ adaptive digital filtering techniques. However, the filter 
used normally cannot precisely replicate the channel, thus resulting in some residual echo. A center-clipping echo
suppressor is then used to eliminate the residual echo. The echo suppressor subjects the signal to a nonlinear function.
Synthesized noise can be used to replace signal sections that were set to zero by the center-clipping echo suppressor
to prevent the channel from sounding "dead".

[0009] Although the just mentioned echo cancellation approach is satisfactory for analog signals, this type of residual 
echo processing causes a problem in digital telephony. As mentioned previously, in a digital system vocoders are used 
to compress speech for transmission. Since vocoders are especially sensitive to nonlinear effects, center-clipping
causes a degradation in voice quality. Furthermore, the noise replacement technique used causes a perceptible variation
in normal noise characteristics.

[0010] It is therefore an object of the present invention to provide a new and improved echo canceller capable of 
providing high dynamic echo cancellation for improved voice quality.

[0011] It is another object of the present invention to provide an echo canceller particularly suited for echo cancellation
in the coupling of a digital communication system with an analog communication system. 

[0012] It is yet another object of the present invention to provide an echo canceller with improved echo cancellation 
performance for cases where both parties are simultaneously talking.

SUMMARY OF THE INVENTION 

[0013] The present invention is a novel and improved network echo canceller for digital telephony applications. In
accordance with the present invention, an echo canceller is employed wherein the impulse response of the unknown
echo channel is identified, a replica of this echo is generated using adaptive filtering techniques, and the echo replica
is subtracted from the signal heading toward the far-end talker to cancel the far-end talker echo.
[0014] In the present invention, two adaptive filters are used where the step size of each filter is specifically adjusted 
to optimize each filter for different purposes. One filter, the echo canceller filter, performs the echo cancellation and is 
optimized for high echo return loss enhancement (ERLE). The second filter, the state filter, is used for state determination
and is optimized for fast adaptation.

[0015] The present invention differs markedly from conventional echo cancellers in its treatment of doubletalk, where 
both speakers are talking simultaneously. Conventional echo cancellers cannot detect doubletalk until the adaptive 
filter that tracks the echo channel has already been slightly corrupted, necessitating the use of a nonlinear center-clipper
to remove the residual echo.

[0016] The present invention also incorporates a variable adaptation threshold. This novel technique halts filter adaptation
immediately at the exact onset of doubletalk, thus preserving the estimated echo channel precisely and obviating
the need for center-clipping to remove the residual echo. As an added feature, the present invention incorporates
an improved method of speech detection, which accurately detects speech even in environments containing
large amounts of background noise. The present invention also utilizes novel techniques that automatically compensate
for flat delays in the echo channel, and permit fast initial adaptation.

[0017] In accordance with the present invention, an echo canceller and method are provided for cancelling in a return channel
signal an echoed receive channel signal where the echoed receive channel signal is combined by an echo channel 
with an input return channel signal. The echo canceller has a first filter which generates first filter coefficients, generates
a first echo estimate signal with the first filter coefficients, and updates the first filter coefficients in response to a first
filter control signal. A first summer subtracts the first echo estimate signal from a combined return channel and echo 
receive channel signal to generate a first echo residual signal. A second filter generates second filter coefficients, 
generates a second echo estimate signal with the second filter coefficients, and updates the second filter coefficients 
in response to a second filter control signal. A second summer subtracts the second echo estimate signal from the
combined signal to generate a second echo residual signal, and provides upon the return channel the second echo
residual signal. A control unit determines from the receive channel signal, the combined signal, and the first and second 
echo residual signals, one of a plurality of control states wherein a first control state is indicative of a receive channel 
signal above a first predetermined energy level, wherein when the control unit is in the first control state it generates 
the first control signal and generates the second control signal when at least one of a first energy ratio of the first echo
residual signal and the combined signal and a second energy ratio of the second echo residual signal and the combined
signal exceed a predetermined level. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0018] The features, objects, and advantages of the present invention will become more apparent from the detailed
description set forth below when taken in conjunction with the drawings in which like reference characters identify 
correspondingly throughout and wherein: 

Figure 1 is a block diagram illustrating an exemplary architecture for a digital cellular telephone system and its 
interface with a land-based telephone system;

Figure 2 is a block diagram of a conventional echo canceller; 

Figure 3 is a graph illustrating the regions in an echo channel impulse response;

Figure 4 is a block diagram of a transversal adaptive filter; 

Figure 5 is a block diagram of the echo canceller of the present invention; 
Figure 6 is a block diagram illustrating further details of the control unit of Figure 5;

Figure 7 is a flow diagram of the sample data processing for echo cancelling;

Figure 8 is a flow diagram of the steps involved in the parameter adjustment step of Figure 7;

Figure 9 is a flow diagram of the steps involved in the periodic function computation step of Figure 7;

Figure 10 is a diagram illustrating the circular-end sample buffer and initial filter tap position;

Figure 11 is a diagram illustrating the tap buffer and a copying of the initial filter taps into the state filter and the
echo canceller filter;

Figure 12 is a diagram illustrating the tap buffer and a maximum shift of the filter tap positions of the state filter 
and echo canceller filter with respect to the samples;

Figure 13 is a state machine diagram illustrating the various states of the echo canceller; and
Figure 14 is a flow diagram of the steps involved in the state machine step of Figure 7.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 


[0019] In a cellular communication system, such as a cellular telephone system, which interfaces with a land-based 
telephone system, a network echo canceller located at the base station cancels echos returning to the mobile station. 
Referring now to Figure 1, an exemplary system architecture is provided for a digital cellular telephone system and its
interface to a land-based telephone system. This system architecture is defined by operational elements of mobile
station 10, cell or base station 30, mobile telephone switching office (MTSO) 40, central office 50, and telephone 60.
It should be understood that other architectures may be employed for the system which include a cellular system with
a mere change in location or position of the various operational elements. It should also be understood that the echo
canceller of the present invention may also be used in replacement of conventional echo cancellers in conventional 
systems. 

[0020] Mobile station 10 includes, among other elements not shown, handset 12, which includes microphone 13 and
speaker 14; codec 16; vocoder 18; transceiver 20 and antenna 22. The mobile station user's voice is received by
microphone 13 where it is coupled to codec 16 and converted to digital form. The digitized voice signal is then compressed
by vocoder 18. The vocoded speech is modulated and transmitted digitally over the air by transceiver 20 and
antenna 22. 

[0021] Transceiver 20 may, for example, use digital modulation techniques such as time division multiple access
(TDMA) or spread spectrum techniques such as frequency hopping (FH) or code division multiple access (CDMA). An
example of CDMA modulation and transmission techniques is disclosed in U.S. Patent No. 5,103,459, entitled "SYSTEM
AND METHOD FOR GENERATING SIGNAL WAVEFORMS IN A CDMA CELLULAR TELEPHONE", issued April
7, 1992, and assigned to the assignee of the present invention, the disclosure of which is incorporated by reference.
In such a CDMA system, vocoder 18 is preferably of a variable rate type such as disclosed in copending U.S. Patent
Application Serial No. 07/713,661 entitled "VARIABLE RATE VOCODER", filed June 11, 1991, and also assigned to
the assignee of the present invention, the disclosure of which is also incorporated by reference.
[0022] Base station 30 includes, among other elements not shown, antenna 32, transceiver system 34 and MTSO
interface 36. Base station transceiver system 34 demodulates and decodes the received signals from mobile station
10 and other mobile stations (not shown) and passes them on to MTSO interface 36 for transfer to MTSO 40. The
signals may be transferred from base station 30 to MTSO 40 via many different methods such as by microwave, fiber
optic, or wireline link.

[0023] MTSO 40 includes, among other elements not shown, base station interface 42, a plurality of vocoder selector
cards 44A - 44N, and public switched telephone network (PSTN) interface 48. The signal from base station 30 is
received at base station interface 42 and provided to one of vocoder selector cards 44A - 44N, for example vocoder
selector card 44A.

[0024] Each of the vocoder selector cards 44A - 44N comprises a respective vocoder 45A - 45N and a respective 
network echo canceller 46A - 46N. The vocoder decoder (not shown) contained within each of vocoders 45A - 45N
synthesizes a digital speech signal from the respective mobile station transmitted speech parameters. These samples
are then sent to the respective echo canceller 46A - 46N, which passes them on to PSTN interface 48. In this example
the signals are provided through vocoder 45A and echo canceller 46A. The synthesized speech samples for each call
are then passed through PSTN interface 48 into the telephone network, typically via a wireline T1 interface, i.e., a time-multiplexed
group of 24 voice channels, to central office 50.

[0025] Central office 50 includes, among other elements not shown, MTSO interface 52, codec 54, and hybrid 56. The
digital signal received at central office 50 through MTSO interface 52 is coupled to codec 54 where it is converted back
to analog form and passed on to hybrid 56. At hybrid 56 the analog four-wire signal is converted to two-wire for transmission
over the wire-pair toward land-based subscriber telephone 60.

[0026] The analog signal output from codec 54 is also reflected off hybrid 56 due to an impedance mismatch. This 
signal reflection takes the form of an echo signal heading back toward the mobile 10. The reflection or echo path at 
hybrid 56 is shown by dotted arrow line 58.

[0027] In the other direction, the two-wire analog speech signal from telephone 60 is provided to central office 50. 
At central office 50 the speech signal is converted to four-wire at hybrid 56 and is added to the echo signal traveling 
toward mobile 10. The combined speech and echo signal is digitized at codec 54 and passed on to MTSO 40 by MTSO
interface 52.

[0028] At MTSO 40 the signal is received by PSTN interface 48 and sent to echo canceller 46A, which removes the
echo before the signal is encoded by vocoder 45A. The vocoded speech signal is forwarded via base station interface
42 to base station 30 and any other appropriate additional base stations for transmission to mobile station 10. The
signal transmitted from base station interface 42 is received at base station 30 by MTSO interface 36. The signal is
passed on to transceiver system 34 for transmission encoding and modulation, and transmitted upon antenna 32.
[0029] The transmitted signal is received upon antenna 22 at mobile station 10 and provided to transceiver 20 for
demodulating and decoding. The signal is then provided to the vocoder 18 where the synthesized speech samples are
produced. These samples are provided to codec 16 for digital to analog conversion with the analog speech signal
provided to speaker 14.

[0030] In order to fully understand the echo canceller of the present invention it is helpful to examine the traditional 
echo canceller and its deficiencies when operating in a digital cellular environment. A block diagram of a traditional 
network echo canceller (NEC) 100 is shown in Figure 2.

[0031] In Figure 2, the speech signal from the mobile station is labeled as the far-end speech x(n), while the speech
from the land side is labeled as near-end speech v(n). The reflection of x(n) off the hybrid is modeled as passing x(n)
through an unknown echo channel 102 to produce the echo signal y(n), which is summed at summer 104 with the
near-end speech signal v(n). Although summer 104 is not an included element in the echo canceller itself, the physical
effect of such a device is a parasitic result of the system. To remove low-frequency background noise, the sum of the
echo signal y(n) and the near-end speech signal v(n) is high-pass filtered through filter 106 to produce signal r(n). The
signal r(n) is provided as one input to summer 108 and to the near-end speech detection circuitry 110.

[0032] The other input of summer 108 (a subtract input) is coupled to the output of an adaptive transversal filter 112.
Adaptive filter 112 receives the far-end speech signal x(n) and a feedback of the echo residual signal e(n) output from
summer 108. In cancelling the echo, adaptive filter 112 continually tracks the impulse response of the echo path, and
subtracts an echo replica ŷ(n) from the output of filter 106 in summer 108. Adaptive filter 112 also receives a control
signal from circuitry 110 so as to freeze the filter adaptation process when near-end speech is detected.

[0033] The echo residual signal e(n) is also output to circuitry 110 and center-clipping echo suppressor 114. The output
of suppressor 114 is provided as the cancelled echo signal when echo cancellation is in operation.
[0034] The echo path impulse response can be decomposed into two sections, the flat delay region and the echo 
dispersion, as is shown in the graph of Figure 3. The flat delay region, where the response is close to zero, is caused
by the round-trip delay for the far-end speech to reflect off the hybrid and return to the echo canceller. The echo
dispersion region, where the response is significant, is the echo response caused by the reflection off the hybrid.
[0035] If the echo channel estimate generated by adaptive filter 112 exactly matches the true echo channel, the echo is
completely cancelled. However, the filter normally cannot precisely replicate the channel, causing some residual echo.
Echo suppressor 114 eliminates the residual echo by passing the signal through a nonlinear function that sets to zero
any signal portion that falls below a threshold A and passes unchanged any signal segment that lies above the threshold
A. Synthesized noise can be used to replace signal sections that were set to zero by the center-clipping to prevent the 
channel from sounding "dead". 
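
The center-clipping and noise replacement described in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `center_clip` and `fill_with_noise`, the uniform noise model, and the sample values are hypothetical and are not taken from the patent; the `threshold` argument plays the role of threshold A and is applied to the sample magnitude.

```python
import random

def center_clip(samples, threshold):
    """Set to zero every sample whose magnitude is below `threshold`; pass the rest unchanged."""
    return [0.0 if abs(s) < threshold else s for s in samples]

def fill_with_noise(clipped, noise_level, rng=None):
    """Replace zeroed samples with low-level synthetic noise (uniform noise is an assumption)."""
    rng = rng or random.Random(0)
    return [rng.uniform(-noise_level, noise_level) if s == 0.0 else s for s in clipped]

# Hypothetical residual: three small echo remnants and two larger samples.
residual = [0.02, -0.5, 0.01, 0.8, -0.03]
clipped = center_clip(residual, threshold=0.1)
comfort = fill_with_noise(clipped, noise_level=0.01)
```

The hard discontinuity at the threshold is the nonlinearity that the following paragraph identifies as harmful to vocoded speech.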

[0036] As mentioned previously, although this approach is satisfactory for analog signals, this residual echo processing
causes a problem in digital telephony, where vocoders are used to compress speech for transmission. Since vocoders
are especially sensitive to nonlinear effects, center-clipping causes a degradation in voice quality while the noise
replacement causes a perceptible variation in noise characteristics.

[0037] Figure 4 illustrates in further detail the structure of adaptive filter 112 of Figure 2. The notations in Figure 4 
are defined as follows: 

N: The filter order;
x(n): The sample of far-end speech at time n;
h_k(n): The k-th filter tap at time n;
r(n): The echo sample at time n;
ŷ(n): The estimated echo at time n; and
e(n): The echo residual at time n.

[0038] Adaptive filter 112 is comprised of a plurality of tapped delay elements 120_1 - 120_{N-1}, a plurality of multipliers
122_0 - 122_{N-1}, summer 124 and coefficient generator 126. An input far-end speech sample x(n) is input to both delay
element 120_1 and multiplier 122_0. As the next samples come into filter 112 the older samples are shifted through delay
elements 120_2 - 120_{N-1}, where they are also output to a respective one of multipliers 122_1 - 122_{N-1}.

[0039] Coefficient generator 126 receives the echo residual signal e(n) output from summer 108 (Figure 2) and
generates a set of coefficients h_0(n) - h_{N-1}(n). These filter coefficient values h_0(n) - h_{N-1}(n) are respectively input to
multipliers 122_0 - 122_{N-1}. The resultant output from each of multipliers 122_0 - 122_{N-1} is provided to summer 124 where
they are summed to provide the estimated echo signal ŷ(n). The estimated echo signal ŷ(n) is then provided to summer
108 (Figure 2) where it is subtracted from the echo signal r(n) to form the echo residual signal e(n). In the traditional
echo canceller of Figure 2, a control input is provided to generator 126 to enable coefficient updating when no near- 
end speech is detected by circuitry 110. When doubletalk or near-end speech only is detected by circuitry 110, the 
control input disables the updating of the filter coefficients. 

[0040] The algorithm implemented in coefficient generator 126 for adapting the filter tap coefficients to track the echo
path response is the normalized least-mean-square (NLMS) adaptation algorithm. Introducing for this algorithm the
vectors:

\[ \mathbf{x}(n) = [\, x(n) \;\; x(n-1) \;\; x(n-2) \;\; \ldots \;\; x(n-N+1) \,] \qquad (1) \]

\[ \mathbf{h}(n) = [\, h_0(n) \;\; h_1(n) \;\; h_2(n) \;\; \ldots \;\; h_{N-1}(n) \,] \qquad (2) \]

the vector inner product between h(n) and x(n) is defined as:

\[ \langle \mathbf{h}(n), \mathbf{x}(n) \rangle = \sum_{i=0}^{N-1} h_i(n)\, x(n-i). \qquad (3) \]

The adaptation algorithm is stated as:

\[ \mathbf{h}(n+1) = \mathbf{h}(n) + \frac{\mu}{E_{xx}(n)}\, e(n)\, \mathbf{x}(n) \qquad (4) \]

where:

h(n) is the tap coefficient vector;
x(n) is the reference signal input vector;
e(n) is the echo residual signal;
μ is the step size; and
E_xx(n) is an energy estimate computed as the sum of the squares of the N most recent samples, where:

\[ E_{xx}(n) = \sum_{i=0}^{N-1} [x(n-i)]^2 \qquad (5) \]



[0041] The main advantages of this algorithm (4) are that it has smaller computation requirements than other adaptive
algorithms, and its stability properties are well-understood. Convergence can be guaranteed by an appropriate choice
of step size (0 < μ < 2) with μ = 1 providing the fastest convergence. Smaller step sizes provide a greater degree of
cancellation in the steady-state at the expense of convergence speed.
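
As a rough illustration of equations (3) through (5), the NLMS update can be sketched in plain Python. This is a minimal sketch rather than the patented implementation: the function names `nlms_step` and `run_nlms`, the small regularizer `eps`, the three-tap test channel, and the white-noise far-end signal are all assumptions introduced for the example.

```python
import random

def nlms_step(h, x_vec, r, mu, eps=1e-12):
    """One NLMS update. Returns (updated taps, echo residual e(n))."""
    y_hat = sum(hk * xk for hk, xk in zip(h, x_vec))  # estimated echo, inner product of eq. (3)
    e = r - y_hat                                     # echo residual e(n)
    exx = sum(xk * xk for xk in x_vec)                # energy estimate of eq. (5)
    gain = mu * e / (exx + eps)                       # eps avoids division by zero (assumption)
    return [hk + gain * xk for hk, xk in zip(h, x_vec)], e  # tap update of eq. (4)

def run_nlms(x, r, n_taps, mu):
    """Run NLMS over whole signals; x_vec holds the n_taps most recent far-end samples."""
    h = [0.0] * n_taps
    x_vec = [0.0] * n_taps
    residuals = []
    for xn, rn in zip(x, r):
        x_vec = [xn] + x_vec[:-1]                     # shift the tapped delay line
        h, e = nlms_step(h, x_vec, rn, mu)
        residuals.append(e)
    return h, residuals

# Hypothetical echo channel with one tap of flat delay, driven by white noise, no near-end speech.
rng = random.Random(1)
echo_channel = [0.0, 0.5, -0.25]
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
r = [sum(c * x[n - k] for k, c in enumerate(echo_channel) if n - k >= 0)
     for n in range(len(x))]
h, residuals = run_nlms(x, r, n_taps=3, mu=1.0)
```

In this noise-free setting the taps `h` approach `echo_channel` and the residual shrinks toward zero; rerunning with a smaller `mu` adapts more slowly, in line with the trade-off stated above.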

[0042] It should be noted that the near-end talker speech signal v(n) is not included in the echo residual signal e(n) 
because adaptive filter 112 is disabled by near-end speech detection circuitry 110 when speech from the near-end 
talker is detected.

[0043] In addition to providing the enable signal to filter 112, circuitry 110 may also generate and provide the value
of E_xx(n) to filter 112 in the control input. Furthermore, the value of μ is typically fixed in generator 126 or a fixed value
provided from circuitry 110 in the control input.

[0044] The most difficult design problem in echo cancellation is the detection and handling of doubletalk, i.e., when 
both parties speak simultaneously. As opposed to a voice-activated switch (VOX) that allows only simplex communication,
an echo canceller preserves duplex communication and must continue to cancel the far-end talker echo while
the near-end speaker is talking. To prevent the filter coefficients from being corrupted by the near-end speech, the filter 
taps must be frozen to prevent divergence from the transfer characteristics of the actual echo channel.
[0045] Referring back to Figure 2, near-end speech detection circuitry 110 may use energy measurements of x(n),
r(n), and e(n) to determine when near-end speech is occurring. A classical doubletalk detection method compares
short-term energy averages of x(n) and r(n) using the knowledge that the echo path loss across the hybrid is about 6
dB. If the hybrid loss drops below 6 dB, near-end speech is declared. However, experimental studies have revealed
that this method lacks sensitivity. The large dynamic range of the near-end speech v(n) causes this method to miss
detection occasionally, causing the filter coefficients to be corrupted.
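
The classical energy comparison described in this paragraph might be sketched as follows. The function names, the window contents, and the small `floor` term that keeps the logarithm finite are hypothetical; only the 6 dB figure comes from the text.

```python
import math

def short_term_energy(window):
    """Sum of squared samples over a short window."""
    return sum(s * s for s in window)

def hybrid_loss_db(x_win, r_win, floor=1e-12):
    """Apparent loss from far-end signal x(n) to return signal r(n), in dB."""
    return 10.0 * math.log10((short_term_energy(x_win) + floor) /
                             (short_term_energy(r_win) + floor))

def near_end_speech_declared(x_win, r_win, min_loss_db=6.0):
    """Declare near-end speech when the apparent hybrid loss drops below 6 dB."""
    return hybrid_loss_db(x_win, r_win) < min_loss_db

# Hypothetical windows: the echo is x attenuated by a factor of 4 in amplitude (about 12 dB).
x_win = [1.0, -1.0, 1.0, -1.0]
echo_only = [0.25 * s for s in x_win]
with_near_end = [e + 0.9 for e in echo_only]   # near-end speech added on top of the echo

far_end_only_flag = near_end_speech_declared(x_win, echo_only)
doubletalk_flag = near_end_speech_declared(x_win, with_near_end)
```

A quiet near-end talker would raise the energy of r(n) only slightly and leave the apparent loss above 6 dB, which is the missed-detection case the paragraph describes.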

[0046] Another popular doubletalk detection method examines the short-term echo return loss enhancement (ERLE),
which is defined as:

\[ \mathrm{ERLE\ (dB)} = 10 \log_{10}\!\left( \sigma_y^2 / \sigma_e^2 \right), \qquad (6) \]

where σ_y² is the variance of y(n), σ_e² is the variance of e(n), and these variances are approximated using the short-term
energy averages:

\[ \hat{\sigma}_y^2 = \sum_{i=0}^{N-1} [y(n-i)]^2 ; \text{ and} \qquad (7) \]

\[ \hat{\sigma}_e^2 = \sum_{i=0}^{N-1} [e(n-i)]^2 \qquad (8) \]

[0047] The ERLE represents the amount of energy that is removed from the echo after it is passed through the echo
canceller. This doubletalk detection method compares short-term energy estimates of r(n) and e(n), and declares doubletalk
if the short-term ERLE drops below some predetermined threshold such as 6 dB. Although this method provides
greater sensitivity, it incurs a slight delay before detecting the onset of near-end speech, causing the echo channel
estimate to be slightly corrupted before adaptation is frozen. This detriment necessitates the use of an additional technique
to remove the residual echo. It is therefore desirable to find an improved method of preserving the echo channel
estimate in doubletalk such as the present invention provides.
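
A short-term ERLE test of this kind could be sketched as below. The sketch follows the paragraph in comparing windows of r(n) and e(n); the function names, the window values, and the `floor` guard are assumptions made for the example, and the 6 dB threshold is the example value given in the text.

```python
import math

def short_term_energy(window):
    """Sum of squared samples, as in the short-term energy averages (7) and (8)."""
    return sum(s * s for s in window)

def erle_db(ref_window, residual_window, floor=1e-12):
    """Short-term ERLE in dB: energy before cancellation over energy after it."""
    return 10.0 * math.log10((short_term_energy(ref_window) + floor) /
                             (short_term_energy(residual_window) + floor))

def doubletalk_declared(ref_window, residual_window, threshold_db=6.0):
    """Declare doubletalk when the short-term ERLE drops below the threshold."""
    return erle_db(ref_window, residual_window) < threshold_db

# Hypothetical windows of r(n) and of e(n) in two situations.
r_win = [0.5, -0.5, 0.5, -0.5]
e_converged = [0.005, -0.005, 0.005, -0.005]   # filter converged, far-end speech only
e_doubletalk = [0.4, -0.3, 0.4, -0.3]          # near-end speech leaks into the residual

converged_flag = doubletalk_declared(r_win, e_converged)
doubletalk_flag = doubletalk_declared(r_win, e_doubletalk)
```

Because the energies are accumulated over a window, the ERLE falls only after several near-end samples have entered the residual, which is the detection delay noted above.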

[0048] In using either of these energy comparison methods to detect doubletalk, high levels of background noise, 
particularly in the cellular call environment, can create difficulties in accurate doubletalk detection. It is therefore de- 
sirable to utilize an improved method for detecting doubletalk in high noise background level environments as the 
present invention provides. 

[0049] Referring now to Figure 5, a block diagram of an exemplary embodiment of network echo canceller (NEC)
140 of the present invention is illustrated. In an exemplary implementation, NEC 140 is configured in digital signal
processor form, such as a model of the TMS320C3X series digital signal processors manufactured by Texas Instruments
of Dallas, Texas. It should be understood that other digital signal processors may be programmed to function in
accordance with the teachings herein. Alternatively, other implementations of NEC 140 may be configured from discrete
processors or in application specific integrated circuit (ASIC) form.

[0050] It should be understood that in the exemplary embodiment, NEC 140 is in essence a state machine that has 
defined functions for each of the different states of operation. The states in which NEC 140 operates are silence, far-end
speech, near-end speech, doubletalk, and hangover. Further details on the operation of NEC 140 are described
later herein.

[0051] In Figure 5, as in Figure 2, the speech signal from the mobile station is labeled as the far-end speech
x(n), while the speech from the land side is labeled as near-end speech v(n). The reflection of x(n) off the hybrid is
modeled as passing x(n) through an unknown echo channel 142 to produce the echo signal y(n), which is summed at
summer 144 with the near-end speech signal v(n). Although summer 144 is not an included element in the echo canceller
itself, the physical effect of such a device is a parasitic result of the system. To remove low-frequency background
noise, the sum of the echo signal y(n) and the near-end speech signal v(n) is high-pass filtered through filter 146 to
produce signal r(n). The signal r(n) is provided as one input to each of summers 148 and 150, and control unit 152.
[0052] The input far-end speech x(n) is stored in buffer 154 for input to a set of transversal adaptive filters (initial
filter 156, state filter 158 and echo canceller filter 160), and control unit 152. In the exemplary embodiment initial filter
156 has 448 filter coefficients or taps while state filter 158 and echo canceller filter 160 each have 256 taps.
[0053] During the initial operation of NEC 140, the speech samples x(n) are provided to initial filter 156 for initial
echo cancellation and echo delay adjustment under the control of control unit 152. During this period of initial operation,
state filter 158 and echo canceller filter 160 are disabled by control unit 152. The initial echo cancellation output signal
ŷ_i(n) from initial filter 156 is provided through filter switch 162 to summer 148. At summer 148 the signal ŷ_i(n) is subtracted
from the signal r(n) to produce an initial estimate of the echo residual signal e(n). Filter switch 162, under the
control of control unit 152, selects between the output of initial filter 156 and echo canceller filter 160 for input to summer
148.

[0054] As mentioned previously, an echo delay adjustment process is undertaken during the period of initial operation
of NEC 140. In this process the filter tap coefficients or taps of initial filter 156 are provided to control unit 152 for a
determination of the taps of largest value. This process is used to distinguish the flat delay region from the echo
dispersion region of the signal.

[0055] Upon completion of the echo delay adjustment process, 256 taps from initial filter 156 are copied into the taps
of state filter 158 and echo canceller filter 160 as described later in further detail. The echo delay adjustment
process ensures that adaptive filtering occurs on the samples x(n) which coincide with the echo dispersion region of
the signal r(n). After this initial operation, state filter 158 and echo canceller filter 160 are enabled and initially use
the taps provided by filter 156. All future adaptation is based upon generated taps.

[0056] During the period of normal operation of NEC 140, the signal ŷ1(n) is output from state filter 158 to one input
of summer 150 where it is subtracted from the signal r(n). The resultant output from summer 150 is the signal e1(n)
which is input to control unit 152. The output of echo canceller filter 160, the echo replica signal ŷ(n), is provided
through filter switch 162 to one input of summer 148 where it is subtracted from the signal r(n). The resultant echo
residual signal e(n) output from summer 148 is fed back as an input to control unit 152. The echo residual signal e(n)
as output from summer 148 may be provided directly as the output of NEC 140 or through additional processing
elements. As discussed later in further detail, control unit 152 also provides control over the adaptation of state
filter 158 and echo canceller filter 160.

[0057] In the present invention a noise analysis/synthesis feature may be provided in the output of NEC 140. This
feature is supported by output switch 164, noise analysis unit 166 and noise synthesis unit 168. Output switch 164 and
noise analysis unit 166 both receive the output signal e(n) from summer 148. Noise analysis unit 166, under the control
of control unit 152, analyzes the signal e(n) and provides an analysis output to noise synthesis unit 168. Noise
synthesis unit 168 generates a synthesized noise signal s(n) based upon the analyzed characteristics of the signal
e(n). The output of noise synthesis unit 168 is then provided to output switch 164. Through output switch 164, which is
under the control of control unit 152, the output of NEC 140 is provided either as the signal e(n) directly from summer
148 or as the synthesized noise signal s(n) from noise synthesis unit 168.

[0058] The majority of a typical phone conversation is spent in singletalk mode, when only one person is speaking
at any time. When only the far-end speaker is talking, NEC 140 uses the noise analysis/synthesis feature to completely
reject the echo by replacing the echo residual signal e(n) with a synthesized noise signal s(n). To prevent the far-end
speaker from detecting any change in signal characteristics, the noise is synthesized to match the power and spectral
characteristics of the actual background noise during the most recent period of silence using linear predictive coding
(LPC) techniques. This noise synthesis method, discussed in further detail later herein, effectively eliminates
singletalk as a design consideration so as to permit the optimization of NEC 140 for doubletalk. Further details on the
noise analysis/synthesis feature are described later.

[0059] As an additional feature of the present invention, a gain stage may also be provided as illustrated in the
exemplary embodiment of Figure 5. In implementing this feature, variable gain element 170 is provided at the input of
far-end speech signal x(n) to NEC 140. The input far-end speech signal x(n) is provided through variable gain stage
170 to buffer 154 and unknown echo channel 142. Control unit 152 provides in combination with variable gain stage
170 an automatic gain control feature to limit signals which might otherwise be affected in a nonlinear manner by
unknown echo channel 142. Control unit 152 and variable gain stage 170 also serve to decrease the convergence
time for the filter adaptation process. Again, further details on this feature are described later.

[0060] As illustrated in the exemplary implementation of the present invention, two independently-adapting filters,
filters 158 and 160, track the unknown echo channel. While filter 160 performs the actual echo cancellation, filter 158
is used by control unit 152 to determine which of several states NEC 140 should be operating in. For this reason,
filters 158 and 160 are respectively referred to as the state filter and the echo canceller filter. The advantage of this
two-filter approach is that the filter coefficients of echo canceller filter 160, which model unknown echo channel 142,
can be preserved more effectively without risk of degradation from near-end speech. By preserving the echo channel
characteristics closely, the design of the present invention obviates the need for center-clipping.
[0061] The control algorithm embodied within control unit 152, which monitors the performance of both filters 158
and 160, is optimized to preserve the estimated echo channel characteristics in doubletalk. Control unit 152 switches
on and off the adaptation of filters 158 and 160 at the proper times, adjusts the step sizes of both filters, and adjusts
gain unit 170 on x(n) to permit fast initial adaptation.

[0062] Figure 6 illustrates (in functional block diagram form) further details of control unit 152 of Figure 5. In Figure
6 control unit 152 is comprised of state machine and process control unit 180, energy computation unit 182, differential
energy magnitude unit 184, variable adaptation threshold unit 186, automatic gain control unit 188 and flat delay
computation unit 190.
[0063] State machine 180 performs the overall state machine function as illustrated with respect to Figure 14, and
various overall process control such as illustrated with respect to Figure 7. State machine 180 provides control over
initial filter 156 and flat delay computation unit 190 during the initial operation of NEC 140. State machine 180 provides
control to state filter 158 and echo canceller filter 160 with respect to initial settings, adaptation control, and step size
control. State machine 180 also provides control over noise analysis unit 166 and switches 162 and 164. State machine
180 also enables variable adaptation threshold unit 186 for state machine adaptation control of echo canceller filter
160. State machine 180 also receives the signals e(n) from summer 148 and e1(n) from summer 150 for respectively
providing to echo canceller filter 160 and state filter 158. In the alternative the signals e1(n) and e(n) may be provided
directly to state filter 158 and echo canceller filter 160.

[0064] Energy computation unit 182 receives the sample values for x(n) from circular buffer 154, r(n) from HPF 146,
e(n) from summer 148, and e1(n) from summer 150, and computes various values as discussed later herein for
providing to differential energy magnitude unit 184 and state machine 180. Differential energy magnitude unit 184 uses
energy values computed in energy computation unit 182 for comparison with threshold levels so as to determine
whether near-end speech and/or far-end speech is present. The result of this determination is provided to state
machine 180.
[0065] Energy computation unit 182 computes energy estimates at each step for filters 158 and 160. These energy
estimates are computed as the sum of squares of the most recent samples. The two energy measurements, Ex(n) and
Exx(n), on signal x(n) at time n are computed respectively over 128 and 256 samples and can be expressed according
to the following equations:



    E_x(n) = \sum_{i=0}^{127} [x(n-i)]^2 ; and        (9)

    E_{xx}(n) = \sum_{i=0}^{255} [x(n-i)]^2 .        (10)

Similarly, energy computation unit 182 computes the energy estimates Er(n), Ee(n) and Ee1(n) at time n for the
respective signals r(n), e(n) and e1(n) according to the following equations:

    E_r(n) = \sum_{i=0}^{127} [r(n-i)]^2 ;        (11)

    E_e(n) = \sum_{i=0}^{127} [e(n-i)]^2 ; and        (12)

    E_{e1}(n) = \sum_{i=0}^{127} [e_1(n-i)]^2 .        (13)

Energy computation unit 182 also computes the hybrid loss at time n, Hloss(n), according to the following equation:

    Hloss(n) (dB) = 10 \log_{10} [ E_x(n) / E_r(n) ].        (14)

The echo return loss enhancement (ERLE) of echo canceller filter 160 is computed by energy computation unit 182
according to the following equation:

    ERLE(n) (dB) = 10 \log_{10} [ E_r(n) / E_e(n) ]        (15)

with the echo return loss enhancement of state filter 158 (ERLE1) also being computed by energy computation unit
182 according to the following equation:

    ERLE1(n) (dB) = 10 \log_{10} [ E_r(n) / E_{e1}(n) ].        (16)

[0066] To avoid nonlinearities in the echo signal caused by the echo channel, it is desirable to limit the received value
of sample x(n) to a value less than a preset threshold near the maximum. Automatic gain control unit 188 in combination
with variable gain stage 170 achieves this result. Automatic gain control unit 188, which receives the samples x(n) from
the circular buffer, provides a gain control signal to variable gain element 170 so as to limit the sample values when
they are excessively large.

[0067] Flat delay computation unit 190, under the control of state machine 180 at the initial operation of NEC 140,
computes the flat delay from the initial filter. Flat delay computation unit 190 then provides circular buffer offset
information to state filter 158 and echo canceller filter 160 to account for the flat delay period for the call.
[0068] In the exemplary embodiment of the network echo canceller of the present invention, a three-pronged
approach is used to solve the doubletalk detection/handling problems. Accordingly the present invention uses (1) two
independently-adapting filters with different step-sizes; (2) a variable threshold to switch filter adaptation on and off;
and (3) a differential energy algorithm for speech detection.

[0069] NEC 140 uses two independently-adapting NLMS adaptive filters. Unlike other two-filter approaches, NEC
140 does not switch back and forth between using filters 158 and 160 for the echo cancellation, nor does it exchange
tap information between the two filters in the steady state. Both of these previously known techniques cause transients
that lead to undesired "pops" in the output of the echo canceller. In the present invention echo canceller filter 160
always performs the actual echo cancellation while state filter 158 is used by the control algorithm embedded within
state machine 180 to distinguish different canceller states. This novel dual-filter approach permits the use of a
conservative adaptation strategy for echo canceller filter 160. If the control algorithm is "unsure" of which state the
canceller should be operating in, it turns off the adaptation of echo canceller filter 160 while state filter 158 continues
to adapt. State machine 180 uses the statistics gleaned from state filter 158 to aid in state determination. The step
sizes of the adaptive filters are adjusted so that echo canceller filter 160 obtains a high ERLE in the steady state, while
state filter 158 responds quickly to any changes in the echo channel response. By allowing the two filters 158 and 160
to simultaneously adapt in the manner just mentioned, overall performance of the echo canceller is enhanced.
[0070] State filter 158 and echo canceller filter 160, along with initial filter 156, are each constructed in the manner
disclosed with reference to Figure 4. State filter 158 and echo canceller filter 160 each contain 256 taps to account
for a 32 ms echo dispersion duration at an 8-kHz sampling rate. It should be understood that for state filter 158 and
echo canceller filter 160, a greater or lesser number of taps may be used depending upon the echo dispersion duration
and sampling rate. Sample buffer 154 contains 512 far-end speech samples to account for a 64 ms time period for the
flat delay and echo dispersion for a call made across the continental United States. To handle the different values of
flat delay encountered in individual phone calls, the network echo canceller of the present invention automatically
determines the flat delay and shifts the filter taps to maximize the number of taps operating on the echo dispersion
region. The echo canceller of the present invention therefore handles echo responses ranging from 0 to 32 ms with
no shift, up to 32 to 64 ms with the maximum delay shift. It should be understood that, as is well known in the art with
respect to digital signal processors and processing techniques associated therewith, initial filter 156 may be used
to form filters 158 and 160. Upon completion of the initial processing, initial filter 156 may be "broken" into the two
filters 158 and 160 with independent coefficient generators. Further details on the initial feature are discussed later
herein.
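
The tap and buffer sizes above follow directly from the sampling rate. The short C sketch below reproduces that arithmetic; the function names are hypothetical and serve only to illustrate the relationship.

```c
/* Time span in milliseconds covered by `taps` samples at `fs_hz`. */
double taps_to_ms(int taps, int fs_hz)
{
    return 1000.0 * (double)taps / (double)fs_hz;
}

/* Largest flat delay (ms) that still leaves `filter_taps` buffered
 * samples for the echo dispersion region. */
double max_flat_delay_ms(int buffer_len, int filter_taps, int fs_hz)
{
    return taps_to_ms(buffer_len - filter_taps, fs_hz);
}
```

At 8 kHz, 256 taps span 32 ms and the 512-sample buffer spans 64 ms, so a 256-tap filter can be shifted by up to 32 ms of flat delay, which matches the 0 to 32 ms and 32 to 64 ms ranges stated above.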
[0071] To preserve the filter coefficients of echo canceller filter 160 at the onset of doubletalk, NEC 140 uses a
variable adaptation threshold (denoted VT) to switch on and off the adaptation of echo canceller filter 160. The variable
adaptation threshold (VT) is computed by variable adaptation threshold unit 186 and provided to state machine 180.
The control algorithm permits echo canceller filter 160 to adapt if either of state filter 158 or echo canceller filter 160
has an ERLE greater than VT. Referring back to Figure 4, the control input provided to generator 126 includes an
enable signal from control unit 152 which permits coefficient vector generator 126 to update the filter coefficients for
filter adaptation. In the event that the ERLE of both filters is less than VT, state machine 180 disables coefficient vector
generator 126 from providing updated coefficients. In this case coefficient vector generator 126 outputs the existing
coefficients until adaptation is enabled once again. The control input also provides other parameters to coefficient
vector generator 126 such as the values of μ, Exx(n) and e(n) of Equation (4).
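
The role of the enable signal can be illustrated with a generic normalized LMS (NLMS) step. Equation (4) is not reproduced in this part of the description, so the sketch below assumes the textbook NLMS update, h(i) += μ e(n) x(n-i) / Exx(n); the function name, argument layout and the `adapt` flag are illustrative assumptions.

```c
#include <stddef.h>

/* One NLMS step for an n-tap transversal filter (textbook form, assumed).
 * h:     tap vector, updated in place when adaptation is enabled
 * x:     the n most recent far-end samples, x[0] being the newest
 * r:     current return-channel sample
 * mu:    step size
 * exx:   energy of x over the filter span
 * adapt: nonzero to update the taps, zero to hold the existing taps
 * Returns the echo residual e = r - (h . x). */
double nlms_step(double *h, const double *x, size_t n,
                 double r, double mu, double exx, int adapt)
{
    double y_hat = 0.0;
    for (size_t i = 0; i < n; i++)
        y_hat += h[i] * x[i];

    double e = r - y_hat;
    if (adapt && exx > 0.0) {
        double g = mu * e / exx;
        for (size_t i = 0; i < n; i++)
            h[i] += g * x[i];
    }
    return e;
}
```

When `adapt` is zero the filter still produces a residual from the existing taps, which corresponds to the generator outputting the existing coefficients until adaptation is enabled again.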

[0072] In Figure 6, the ERLE for state filter 158 is computed in energy computation unit 182 according to Equation
(6) using the values of r(n) and e1(n). Similarly the computation is done in energy computation unit 182 for echo
canceller filter 160 with the values of r(n) and e(n). In variable adaptation threshold unit 186, the VT is initialized by
state machine 180 to an initial minimum threshold, which in the exemplary embodiment is 6 dB. The threshold
processing in variable adaptation threshold unit 186 can be described by the following C-code:

    /* ERLE, VT and MT (the minimum threshold) are all expressed in dB. */
    if (ERLE > VT + 6) {
        VT = MAX(VT, ERLE - 6);
    } else if (ERLE < MT - 3) {
        VT = MT;
    }

[0073] As the ERLE rises past (VT + 6 dB), the adaptation threshold also rises, remaining 6 dB behind the peak
ERLE. This 6 dB margin accounts for the variability of the ERLE. State machine 180 permits echo canceller filter 160
to continue to adapt if the ERLE of either of filters 158 and 160 is within 6 dB of the last ERLE peak. If the ERLE drops
3 dB below the minimum threshold, the adaptation threshold is reset to the minimum threshold. The advantage of this
approach is that the adaptation of echo canceller filter 160 is halted right at the onset of doubletalk. For
example, suppose the far-end speaker is the only one talking and the last ERLE peak is at 34 dB. Once the near-end
speaker starts to talk, the ERLE falls and the filter adaptation is stopped when the ERLE hits 28 dB. Classical
near-end speech detectors will not suspend adaptation until the ERLE falls below about 6 dB, which permits the echo
channel estimate to be slightly corrupted. Therefore, by preserving the echo channel characteristics more closely, the
present invention achieves greater echo rejection in doubletalk while avoiding the voice-quality degradation associated
with center-clippers used in traditional echo cancellers.
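
The threshold behaviour described above, including the 34 dB example, can be traced with a small self-contained C sketch. The struct and function names are hypothetical, and it is an assumption here that the threshold update is driven by the ERLE of the echo canceller filter; the adaptation test follows the stated rule that either filter's ERLE must exceed VT.

```c
typedef struct {
    double vt;  /* current variable adaptation threshold, dB */
    double mt;  /* minimum threshold, dB (6 dB initially in the text) */
} VarThreshold;

static double max_d(double a, double b) { return a > b ? a : b; }

/* Threshold update following the C-like fragment in the text. */
void vt_update(VarThreshold *t, double erle_db)
{
    if (erle_db > t->vt + 6.0)
        t->vt = max_d(t->vt, erle_db - 6.0);
    else if (erle_db < t->mt - 3.0)
        t->vt = t->mt;
}

/* Echo canceller filter adaptation is permitted when the ERLE of
 * either filter is greater than VT. */
int adapt_allowed(const VarThreshold *t, double erle_db, double erle1_db)
{
    return erle_db > t->vt || erle1_db > t->vt;
}
```

Starting from VT = MT = 6 dB, an ERLE peak of 34 dB moves VT to 28 dB; once both ERLE values fall to 28 dB or lower, `adapt_allowed` returns 0, and an ERLE below 3 dB resets VT to 6 dB.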
[0074] In the exemplary embodiment of the present invention it is preferred that the ERLE of both filters 158 and 160
drop below VT before adaptation of filter 160 is halted. This characteristic of the control algorithm helps distinguish the
onset of doubletalk from the normal variability of either ERLE measurement, because the ERLE of both filters will drop
immediately at the onset of doubletalk.

[0075] A further aspect of the present invention is that as filters 158 and 160 obtain convergence, the value of the
minimum threshold for VT is increased from the initial setting. As the minimum threshold for VT increases, a higher
ERLE is necessary before echo canceller filter 160 is adapted.

[0076] To prevent large background noise levels from interfering with state determination, the echo canceller of the
present invention uses a differential energy algorithm on the signals x(n) and e(n). This algorithm, embedded within
differential energy magnitude unit 184 and state machine 180 and described in further detail later herein, continually
monitors the background noise level and compares it with the signal energy to determine if the speaker is talking.
Differential energy magnitude unit 184 in the exemplary embodiment computes three thresholds T1(Bi), T2(Bi), and
T3(Bi), which are functions of the background noise level Bi. If the signal energy of the signal x(n) exceeds all three
thresholds, the speaker is determined to be talking. If the signal energy exceeds T1 and T2 but not T3, the speaker is
determined to be probably uttering an unvoiced sound, such as the "sp" sound in the word "speed." If the signal energy
is smaller than all three thresholds, the speaker is determined to be not talking.

[0077] An exemplary overall flow diagram of sample data processing in the echo canceller of the present invention
is shown in Figure 7. The algorithm under the control of state machine 180 initially starts, block 200, and then
first obtains the μ-law samples of x(n) and v(n), block 202, which are then converted to their linear values, block 204.
The v(n) sample is then passed through the high-pass filter (HPF) to obtain sample r(n), block 206. The HPF, filter 146
of Figure 5, which eliminates residual DC and low frequency noise, is a digital filter constructed using well known digital
filter techniques. The HPF is typically configured as a third order elliptic filter with a stopband cutoff of 120 Hz with
37 dB rejection, and a passband cutoff of 250 Hz with 0.7 dB ripple. The HPF is typically implemented as a cascade of
first order and second order direct-form realizations with the coefficients indicated in Table I as follows:



TABLE I

                          A(1)         A(2)        B(0)        B(1)         B(2)
    First-order section   -.645941     0           .822970     -.822970     0
    Second-order section  -1.885649    .924631     1.034521    -2.061873    1.034461
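
A cascade of the two sections in Table I can be sketched in C as shown below. The specification does not state the sign convention for A(1) and A(2); the sketch assumes the transfer function (B(0) + B(1) z^-1 + B(2) z^-2) / (1 + A(1) z^-1 + A(2) z^-2), which is the convention under which the tabulated poles lie inside the unit circle. The type and function names are hypothetical.

```c
/* Direct-form I section (sign convention for a1, a2 is an assumption):
 *   y(n) = b0 x(n) + b1 x(n-1) + b2 x(n-2) - a1 y(n-1) - a2 y(n-2) */
typedef struct {
    double a1, a2, b0, b1, b2;
    double x1, x2, y1, y2;   /* delayed inputs and outputs */
} Section;

double section_step(Section *s, double x)
{
    double y = s->b0 * x + s->b1 * s->x1 + s->b2 * s->x2
             - s->a1 * s->y1 - s->a2 * s->y2;
    s->x2 = s->x1; s->x1 = x;
    s->y2 = s->y1; s->y1 = y;
    return y;
}

typedef struct { Section first, second; } Hpf;

/* Load the Table I coefficients and clear the delay lines. */
void hpf_init(Hpf *f)
{
    Section s1 = { -0.645941, 0.0, 0.822970, -0.822970, 0.0,
                   0.0, 0.0, 0.0, 0.0 };
    Section s2 = { -1.885649, 0.924631, 1.034521, -2.061873, 1.034461,
                   0.0, 0.0, 0.0, 0.0 };
    f->first = s1;
    f->second = s2;
}

/* One sample through the first-order then the second-order section. */
double hpf_step(Hpf *f, double x)
{
    return section_step(&f->second, section_step(&f->first, x));
}
```

Because B(0) = -B(1) in the first-order section, a constant (DC) input is driven to zero at the output once the start-up transient has decayed, consistent with the stated purpose of removing residual DC.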



[0078] Next, the energy averages Ex(n) and Exx(n) are updated for the signal sample x(n), block 208. The energy
average Er(n) is then updated for the signal sample r(n) along with the computing of the energy loss Hloss(n) on the
hybrid, block 210.

[0079] The output of adaptive filter 158 (Figure 5), the value ŷ1(n), is computed, block 212, with the echo residual
e1(n) then being determined, block 214. The ERLE1 and energy average Ee1 for filter 158 are then updated, block 216.
Similarly the output of adaptive filter 160 (Figure 5), the value ŷ(n), is computed, block 218, with the echo residual e(n)
then being determined, block 220. The ERLE and energy average Ee for filter 160 are then updated, block 222. It
should be understood that certain of the steps set forth in blocks 208 - 222 may be performed in various other orders
as dictated by the values required for further steps. Furthermore certain steps may be performed in parallel, such as
steps 212 - 216 and 218 - 222. Therefore the order discussed herein with reference to Figure 7 is merely an exemplary
order of processing steps.

[0080] Upon completion of the previous steps a parameter adjustment step is performed, block 224, with this step
described in further detail with respect to Figure 8. Upon completion of the parameter adjustment step a periodic
function step is performed, block 226, with this step described in further detail with respect to Figure 9. Upon completion
of the periodic function step a state machine operation step is performed, block 228, with this step described in further
detail with respect to Figure 14. Upon completion of the state machine operation step the process repeats with a return
to point A in the flow diagram.

[0081] The flow diagram in Figure 8 illustrates in further detail the parameter adjustment step of block 224 of Figure
7. In the parameter adjustment step the filter step-size and variable threshold parameters are updated during the echo
canceller operation.

[0082] Both state filter 158 and echo canceller filter 160 (Figure 5) are initialized by state machine 180 at the start
of operation by providing in the control input to the filter coefficient generator a step size of 1 (μ1 = μ2 = 1). This
initialization of the filters at this level permits a fast initial convergence. Upon reaching the parameter adjustment step
an initial parameter adjustment algorithm is utilized. In this initial algorithm a determination is made as to whether the
control element set value of μ2 for the echo canceller filter is greater than a fixed value of 0.5, block 250. If so, a
determination is made as to whether the ERLE is greater than 14 dB, block 252. If the ERLE is not greater than 14 dB,
such as at the beginning of obtaining convergence of the channel, a counter (Scount counter) value is set equal to
zero (Scount=0), block 254, and the parameter adjustment step is completed for this sample with the subroutine exited
at point C.

[0083] Should the ERLE be determined to be greater than 14 dB, the counter is incremented, block 256. A
determination is then made as to whether the Scount value has been incremented to a count value of 400, block 258.
If the Scount value is less than the count value of 400 the parameter adjustment step is completed for this sample with
the subroutine exited at point C.

[0084] However, should the determination in block 258 result in the Scount value being found to be equal to the
count value of 400, which corresponds to the ERLE being greater than 14 dB for 50 ms (consecutively), the step size
(μ1) of the state filter is shifted to 0.7 and the step size (μ2) of the echo canceller filter is shifted to 0.4, block 260. Also
in block 260 the Scount counter is reset to zero. The parameter adjustment step is then completed for this sample with
the subroutine exited at point C.
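
The initial parameter adjustment of blocks 250 through 260 amounts to a per-sample counter with a reset. A minimal C sketch is given below; the struct, the function name and the return value are illustrative assumptions, while the constants (0.5, 14 dB, 400 samples, 0.7 and 0.4) are those stated in the text.

```c
#define GEARSHIFT_COUNT 400   /* 400 samples = 50 ms at 8 kHz */

typedef struct {
    double mu1;   /* step size of the state filter */
    double mu2;   /* step size of the echo canceller filter */
    int scount;   /* consecutive samples with ERLE above the limit */
} StepSizes;

/* Initial parameter adjustment, called once per sample.
 * Returns 1 when the step sizes were shifted on this call, else 0. */
int initial_gearshift(StepSizes *s, double erle_db)
{
    if (!(s->mu2 > 0.5))                 /* block 250: not this stage */
        return 0;
    if (!(erle_db > 14.0)) {             /* block 252 */
        s->scount = 0;                   /* block 254 */
        return 0;
    }
    if (++s->scount < GEARSHIFT_COUNT)   /* blocks 256 and 258 */
        return 0;
    s->mu1 = 0.7;                        /* block 260 */
    s->mu2 = 0.4;
    s->scount = 0;
    return 1;
}
```

Any sample with an ERLE of 14 dB or less clears the counter, so the shift occurs only after 400 consecutive qualifying samples.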

[0085] If in block 250 the control element set value of μ2 for the echo canceller filter is determined to be not greater
than a fixed value of 0.5, an intermediate algorithm is invoked. In this intermediate algorithm a determination is made
as to whether the value for μ2 is greater than 0.2, block 262. If so, a determination is made as to whether the ERLE is
greater than 20 dB, block 264. If the ERLE is not greater than 20 dB the Scount value is set equal to zero (Scount=0),
block 266, and the parameter adjustment step is completed for this sample with the subroutine exited at point C.
[0086] Should the ERLE be determined to be greater than 20 dB, the counter is incremented, block 268. A
determination is then made as to whether the counter value has been incremented to a count value of 400, block 270.
If the counter value is less than the count value of 400 the parameter adjustment step is completed for this sample with
the subroutine exited at point C.

[0087] However, should the determination in block 270 result in the Scount counter value being found to be equal
to the count value of 400, which corresponds to the ERLE being greater than 20 dB for 50 ms, the value μ1 is shifted
to 0.4, with the value μ2 shifted to 0.1, block 272. Further in block 272 the minimum threshold is increased from the
[...] with the subroutine exited at point C.

[0088] It should be noted that "gearshifting" of the filters to smaller step sizes permits higher ERLE levels to be used.
However, in the preferred embodiment [...] attains a high steady-state ERLE, and the state [...] response.

[0089] After the echo canceller filter [...] the variable adaptation threshold algorithm goes into effect to preserve the
echo channel [...] completed for this sample with the subroutine exited [...]

[0090] [...] to the state machine processing in Figure 14. [...] near-end background noise, [...]

[0092] To promote a fast transition into the steady state, the echo canceller of the present invention [...] during
far-end speech. As shown in Figure 5, state machine 180 [...] 170. This initial [...] dB gain increases the size of the
echo received at r(n) [...] increases by 3 dB) which allows faster [...]

[...] made to automatically avoid clipping. [...] typically range between -8031 [...] value of [...] or -8031, the samples
returning from the hybrid [...] solve this problem, the echo canceller of the present invention uses [...] whenever the
absolute [...] when the far-end talker is shouting.

[0094] Referring back to Figure 7, after the parameter adjustment [...] the periodic function computation step is
performed. Figure 9 illustrates the three computations that are [...] computation step: (1) the differential energy
magnitudes of signals x(n) and e(n), (2) the [...] autocorrelation and Durbin [...], and (3) the tap-shifting [...]

[0095] In Figure 9, the periodic function computation [...] be performed. Regardless [...] talker is speaking. The DEM(x)
is in the preferred embodiment [...] computation unit 182 of [...] noise level XBi, block 302.

[0097] In this step the background noise estimate is computed every 128 samples, where the next update XB_{i+1} is
computed as:

    XB_{i+1} = \min ( E_x, 160000, \max ( 1.00547 XB_i, XB_i + 1 ) ).        (17)

The three threshold values are computed as a function of XB_i as follows:

    T_1(XB_i) = -(3.160500 \times 10^{-5}) XB_i^2 + 10.35 XB_i + 704.44;        (18)

    T_2(XB_i) = -(7.938816 \times 10^{-4}) XB_i^2 + 26.00 XB_i + 1769.48;        (19)

and

    T_3(XB_i) = -(3.160500 \times 10^{-4}) XB_i^2 + 103.5 XB_i + 7044.44.        (20)

[0098] The energy Ex of the far-end signal is again compared with these three thresholds. If Ex is greater than all
three thresholds, DEM(x) = 3, indicating that speech is present. If Ex is greater than T1 and T2 but not T3, then
DEM(x) = 2, indicating that unvoiced speech is probably present. If Ex is greater than T1 but not T2 and T3, DEM(x) = 1.
And finally if Ex is less than all three thresholds, DEM(x) = 0, indicating that no speech is present. The value of
DEM(x) is provided from differential energy magnitude unit 184 to state machine 180.
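
The four-level decision and the background noise update of Equation (17) can be sketched in C as below. The function names are hypothetical, and the classifier assumes the three thresholds are ordered T1 <= T2 <= T3, so that exceeding a higher threshold implies exceeding the lower ones. The thresholds are passed in by the caller rather than computed here.

```c
/* Differential energy magnitude in the range [0,3].
 * Assumes t1 <= t2 <= t3 (an assumption of this sketch). */
int dem_level(double energy, double t1, double t2, double t3)
{
    if (energy > t3) return 3;   /* speech present */
    if (energy > t2) return 2;   /* unvoiced speech probably present */
    if (energy > t1) return 1;
    return 0;                    /* no speech present */
}

static double min_d(double a, double b) { return a < b ? a : b; }
static double max_d(double a, double b) { return a > b ? a : b; }

/* Background noise update of Equation (17), run every 128 samples:
 * the estimate drops at once to a lower energy, and otherwise rises
 * slowly, by the larger of about 0.547 percent or 1. */
double noise_update(double xb, double ex)
{
    double grown = max_d(1.00547 * xb, xb + 1.0);
    return min_d(min_d(ex, 160000.0), grown);
}
```

For example, an estimate of 160000 falls immediately to 1000 when the measured energy is 1000, whereas an estimate of 100 facing an energy of 5000 rises only to 101 on that update.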

[0099] Similarly, the differential energy magnitude of signal e, DEM(e), is computed and used to determine whether
the near-end speaker is speaking. The DEM(e) is in the preferred embodiment also provided as an integer value in
the range of [0,3]. The DEM(e) is determined by comparing the energy Ee of the signal e(n), provided from energy
computation unit 182 of Figure 6, with the following three computed thresholds in block 304:

    T_1(EB_i) = -(6.930766 \times 10^{-6}) EB_i^2 + 4.047152 EB_i + 289.7034;        (21)

    T_2(EB_i) = -(1.912166 \times 10^{-5}) EB_i^2 + 8.750045 EB_i + 908.971;        (22)

and

    T_3(EB_i) = -(4.946311 \times 10^{-5}) EB_i^2 + 18.89962 EB_i + 2677.431        (23)

where the background noise estimate of signal e(n) is also updated every 128 samples as:

    EB_{i+1} = \min ( E_e, 160000, \max ( 1.00547 EB_i, EB_i + 1 ) ).        (24)

[0100] If Ee is greater than all three thresholds, DEM(e) = 3, indicating that near-end speech is present. If Ee is
greater than T1 and T2 but not T3, then DEM(e) = 2, indicating that unvoiced near-end speech is probably present. If
Ee is greater than T1 but not T2 and T3, DEM(e) = 1. And finally if Ee is less than all three thresholds, DEM(e) = 0,
indicating that no speech is present. The value of DEM(e) is also provided from differential energy magnitude unit 184
to state machine 180.

[0101] Once the values of DEM(x) and DEM(e) are computed, the values of XBi and EBi are updated per Equations
(17) and (24) in block 306. It should be noted that both XBi and EBi are initialized to a value of 160000.
[0102] By using differential energy measurements that track the background noise level, an accurate determination 
of whether someone is speaking can be made even in high levels of background noise. This aids state machine 180 
in Figure 6 in making correct state determinations. 

[0103] As mentioned previously, a noise analysis computation is performed in the periodic function computation step.
When the function select, block 300, detects that the state machine is of a state "0" for the current sample, a
determination is made as to whether the last 256 samples, including the current sample, were all of a state machine
state "0", block 308. If so, a linear predictive coding (LPC) method, traditionally used for vocoding speech, is used to
compute the spectral characteristics of the noise. However, if these samples were not all of state "0" the LPC method
is skipped.
[0104] The LPC method models each sample as being produced by a linear combination of past samples plus an
excitation. When neither speaker is talking, the error signal e(n) is passed through a prediction error filter (noise
analysis element 166 of Figure 5) to remove any short-term redundancies. The transfer function for this filter is given
by the equation:

    A(z) = 1 - \sum_{i=1}^{P} a_i z^{-i}        (25)

where the order of the predictor in the exemplary embodiment is 5 (P = 5).

[0105] The LPC coefficients, a_i, are computed from a block of 128 samples using the autocorrelation method, block
310, with Durbin's recursion, block 312, as discussed in the text Digital Processing of Speech Signals by Rabiner &
Schafer, which is a well known efficient computational method. The first 6 autocorrelation coefficients R(0) through
R(5) are computed as:

    R(k) = \sum_{m=0}^{127-k} e(m) e(m+k).        (26)

[0106] The LPC coefficients are then computed directly from the autocorrelation values using Durbin's recursion
algorithm. The algorithm can be stated as follows:

(1) $$E^{(0)} = R(0), \quad i = 1 \qquad (27)$$

(2) $$k_i = \frac{R(i) - \sum_{j=1}^{i-1} \alpha_j^{(i-1)} R(i-j)}{E^{(i-1)}} \qquad (28)$$

(3) $$\alpha_i^{(i)} = k_i \qquad (29)$$

(4) $$\alpha_j^{(i)} = \alpha_j^{(i-1)} - k_i\, \alpha_{i-j}^{(i-1)}, \quad 1 \le j \le i-1 \qquad (30)$$

(5) $$E^{(i)} = (1 - k_i^2)\, E^{(i-1)} \qquad (31)$$

(6) If i < P then go to (2) with i = i + 1. (32)

(7) The final solution for the LPC coefficients is given as

$$a_j = \alpha_j^{(P)}, \quad 1 \le j \le P \qquad (33)$$
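The autocorrelation method and Durbin's recursion described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation of the exemplary embodiment: the function names `autocorrelation` and `durbin` are hypothetical, the early exit on a non-positive error energy is an added assumption for degenerate (for example all-zero) blocks, and the error-energy update inside the loop is the standard step of the textbook recursion.

```python
def autocorrelation(e, order=5):
    """Equation (26): R(k) = sum over m of e(m) * e(m + k), for k = 0..order."""
    n = len(e)  # 128 samples in the exemplary embodiment
    return [sum(e[m] * e[m + k] for m in range(n - k)) for k in range(order + 1)]


def durbin(r, order=5):
    """Durbin's recursion on autocorrelation values r[0..order].

    Returns (a, err): a[1..order] are the LPC coefficients (a[0] is unused)
    and err is the final prediction error energy.
    """
    alpha = [0.0] * (order + 1)
    err = r[0]                            # E(0) = R(0)
    for i in range(1, order + 1):
        if err <= 0.0:                    # assumption: stop on a degenerate block
            break
        # Reflection coefficient k_i
        k = (r[i] - sum(alpha[j] * r[i - j] for j in range(1, i))) / err
        updated = alpha[:]
        updated[i] = k                    # alpha_i(i) = k_i
        for j in range(1, i):             # alpha_j(i) for 1 <= j <= i-1
            updated[j] = alpha[j] - k * alpha[i - j]
        alpha = updated
        err *= 1.0 - k * k                # standard error-energy update
    return alpha, err
```

For a block `e` of 128 residual samples collected during silence, `durbin(autocorrelation(e))` would yield the five coefficients a_1 through a_5 used by the analysis filter A(z).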

[0107] Once the LPC coefficients are obtained, synthesized noise samples can be generated with the same spectral
characteristics by passing white noise through the noise synthesis filter (noise synthesis element 168 of Figure 5) given
by:

$$\frac{1}{A(z)} = \frac{1}{1 - \sum_{i=1}^{P} a_i z^{-i}} \qquad (34)$$

which is just the inverse of the filter used for noise analysis.
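As a rough illustration of the synthesis step, the all-pole filter 1/A(z) corresponds to the difference equation s(n) = w(n) + a_1 s(n-1) + ... + a_P s(n-P), where w(n) is white noise. The sketch below assumes Gaussian white noise and a hypothetical `gain` parameter for scaling the excitation; neither the noise distribution nor the scaling is specified in the text above, and the function name is illustrative.

```python
import random


def synthesize_noise(a, n_samples, gain=1.0, seed=0):
    """Pass white noise through 1/A(z), where A(z) = 1 - sum_i a[i] z^-i.

    a[1..P] are LPC coefficients (a[0] is unused). `gain` and `seed` are
    hypothetical parameters introduced for this sketch.
    """
    rng = random.Random(seed)
    order = len(a) - 1
    history = [0.0] * order               # history[0] holds s(n-1)
    out = []
    for _ in range(n_samples):
        w = rng.gauss(0.0, 1.0)           # white excitation sample (assumed Gaussian)
        s = gain * w + sum(a[i] * history[i - 1] for i in range(1, order + 1))
        history = ([s] + history)[:order] # shift the new output into the delay line
        out.append(s)
    return out
```

Because the filter is recursive, each output depends on the previous P outputs, which is what gives the synthesized noise the spectral shape captured by the coefficients.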

[0108] It should be understood that in the exemplary embodiment, LPC coding techniques provide an excellent method
for modeling the noise. However, other techniques can be used for modeling the noise, or no noise modeling may be
used at all.

[0109] As a further function of the periodic function computation step, a tap shifting algorithm is employed to account
for varying echo delays. This computation is performed upon initial sample processing for a call, and optionally upon
every 256 samples, provided that the ERLE is greater than 10 dB, block 314. Should the ERLE be greater than 10 dB,
an indication that some cancellation is present, the largest tap, i.e., the filter coefficient of the largest value, in the initial
filter (filter 156 of Figure 5) is determined, block 316, in flat delay computation unit 190 of Figure 6. The taps are then
shifted to process a greater number of samples from the echo dispersion region and fewer from the flat delay region,
block 318. The shifting of the taps places a greater number of echo dispersion region samples from the buffer into the
state filter and echo canceller filter than would normally occur. The energy averages on these samples are then
recomputed, block 320. Once the tap shifting algorithm is completed, or either of the other two computations of the
periodic function computation step is completed, the Fcount is incremented, block 322, and the subroutine is exited.

[0110] With respect to the echo delay adjustment, since the distance between the echo canceller at the base station
and the hybrid in the telephone network can vary widely between calls, the flat delay of the echo signal also has a wide
range. We can quickly estimate the range of this delay by assuming that the U.S. is 3000 miles across and electrical
signals propagate at 2/3 the speed of light. Since the round-trip distance is 6000 miles, the maximum flat delay is
approximately:

$$\frac{(6000\ \text{miles}) \times (1609.34\ \text{meters/mile})}{2 \times 10^{5}\ \text{meters/ms}} = 48.3\ \text{ms} \qquad (35)$$
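The delay estimate can be checked numerically. The short sketch below reproduces the arithmetic and, using the 8 kHz sampling rate mentioned elsewhere in this description, also expresses the delay in samples; the variable names are illustrative only.

```python
ROUND_TRIP_MILES = 6000.0        # twice the assumed 3000-mile width
METERS_PER_MILE = 1609.34
PROPAGATION_M_PER_MS = 2.0e5     # about 2/3 the speed of light, in meters per millisecond
SAMPLES_PER_MS = 8               # 8 kHz sampling

flat_delay_ms = ROUND_TRIP_MILES * METERS_PER_MILE / PROPAGATION_M_PER_MS
flat_delay_samples = flat_delay_ms * SAMPLES_PER_MS

print(f"maximum flat delay: {flat_delay_ms:.1f} ms, about {flat_delay_samples:.0f} samples")
```

Under these assumptions the worst-case flat delay is a little under 390 samples, which fits within the 512-sample (64 ms) span of far-end speech that the canceller buffers.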

[0111] The network echo canceller of the present invention accounts for the different values of flat delay found in
different calls so that more taps operate on the echo dispersion region instead of being 'wasted' on the flat delay
region. For example, in a traditional echo canceller with no tap-shifting mechanism, a flat delay of 16 ms would cause
the first 128 taps of the echo canceller to be close to zero because the 128 most recent samples in the filter delay line
are not correlated with the echo sample entering the canceller. The actual echo signal would therefore only be cancelled
by the remaining 128 taps. In contrast, the NEC of the present invention automatically determines that the flat delay
is 16 ms and shifts the taps to operate on older samples. This strategy utilizes more taps on the echo dispersion region,
which results in better cancellation.

[0112] The NEC of the present invention stores 512 samples of the far-end speech x(n) in a circular buffer (buffer
154 of Figure 5), which corresponds to a delay of 64 ms. When the canceller starts up, it initially adapts, in initial filter
156 of Figure 5, 448 filter taps on the 448 most recent samples as shown in Figure 10.

[0113] After obtaining initial convergence with the taps in this position, the algorithm determines the flat delay within
flat delay computation unit 190 by finding the largest tap value and its respective position in the tap buffer of the initial
filter 156. The tap number of the largest tap (denoted Tmax) corresponds to the flat delay because it is the time (in 8
kHz samples) for a far-end speech sample to be output from the echo canceller, reflect off the hybrid, and return to the
input of the echo canceller. Instead of shifting the taps by Tmax, the algorithm leaves a safety margin of 32 samples
in case the echo channel response changes slightly. The actual tap shift value is given by:

$$\mathrm{Tshift} = \mathrm{MAX}[0, \mathrm{MIN}(\mathrm{Tmax} - 32, 256)] \qquad (36)$$
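The tap shift computation can be sketched as follows. The function name and parameter names are hypothetical, and the sketch assumes that the "largest tap" is the coefficient of largest magnitude, which the text does not state explicitly.

```python
def tap_shift(initial_taps, margin=32, max_shift=256):
    """Return Tshift = MAX[0, MIN(Tmax - margin, max_shift)].

    Tmax is the index of the largest tap of the initial filter
    (largest magnitude is assumed here).
    """
    t_max = max(range(len(initial_taps)), key=lambda i: abs(initial_taps[i]))
    return max(0, min(t_max - margin, max_shift))


# Example: 448 initial taps with the peak at tap 128 (a 16 ms flat delay at 8 kHz).
taps = [0.0] * 448
taps[128] = 0.5
print(tap_shift(taps))  # 128 - 32 = 96
```

The lower clamp keeps the shift from going negative when the echo arrives almost immediately, and the upper clamp of 256 bounds the shift for echoes with long flat delays.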

[0114] Once Tshift is determined, the initial filter taps starting from Tshift are copied into both the state filter and
the echo canceller filter by flat delay computation unit 190 as illustrated in Figure 11. An offset by Tshift into the circular
buffer is used so that the zeroth filter tap of both the control filter and the echo canceller filter lines up with the sample
that arrived Tshift places before the most recent sample. Figure 12 illustrates the maximum shift so as to permit an
echo coverage of 64 ms. After the taps have been shifted to operate on older samples, the energy measurements Ex(n)
and Exx(n) are correspondingly modified to measure the sum of squares of these older samples.

[0115] As described herein for purposes of illustration, three adaptive filters have been described. However, it should
be understood that in the various implementations, particularly in a digital signal processor, the initial filter may
also function as the state filter and the echo canceller filter using the same physical memory.

[0116] Upon exiting of the periodic function computation step at point D, Figures 7 and 9, a state machine control
algorithm is executed by state machine 180 (Figure 6). The state machine control algorithm can be modeled as a state
machine with five states, as shown in Figure 13. The state machine control algorithm as embodied in state machine
180 can result in a change in state with each new sample.

[0117] State 0, block 330, is the silence state, where neither speaker is talking. Neither the state filter nor the echo
canceller filter adapts in this state to prevent divergence from the echo channel. If the NEC remains in state 0 for 256
consecutive sample times, the control algorithm initiates the noise analysis routine in Figure 9, to code the frequency
characteristics of the background noise using LPC analysis.

[0118] If the far-end speaker is the only one talking, the NEC enters state 1, block 332, in which the state filter always
adapts. The echo canceller filter adapts if the ERLE of either filter is above the adaptation threshold VT. The noise
synthesis routine generates noise (using the LPC coefficients obtained during the last interval of silence) to replace
any residual echo. In effect, the NEC has infinite ERLE in state 1 because no matter how loud the far-end speech x(n)
is, the echo residual will never be passed back to the mobile.

[0119] If the near-end speaker is the only person talking, the NEC enters state 2, block 334. Here, the state machine
freezes adaptation of both filters and outputs the signal e(n). If the near-end speaker stops talking, the NEC transitions
to state 4 (hangover), with a hangover of 50 ms in the exemplary embodiment, before transitioning to state 0 (silence).
This hangover accounts for possible pauses in near-end speech. If the far-end speaker starts to talk, the NEC transitions
to state 3 (doubletalk).

[0120] In state 3, block 336, which is the doubletalk state, the state machine freezes adaptation of the echo canceller
filter and outputs e(n). If the hybrid loss is above 3 dB, the state machine control algorithm permits the state filter to
adapt to account for a possible change in the echo channel impulse response. For example, suppose both filters are
converged, the far-end speaker is the only one talking, and the echo channel changes abruptly. This situation might
occur, for example, if someone picks up an extension phone so that the mobile station speaker is talking to two people
on the land-telephone side simultaneously. In this case the ERLE of both filters would suddenly drop and the NEC
would shift to the doubletalk state, mistaking the echo signal for near-end speech. Although both filters would normally
be frozen in doubletalk, in this case if both filters are not allowed to adapt, the NEC will remain in this state until the
call terminates. However, the NEC uses the hybrid loss to determine whether the state filter is allowed to adapt. As the
state filter adapts, its ERLE will rise as it reacquires the new echo channel, and the NEC will recover out of state 3
(doubletalk). As shown in the state diagram, the only way to exit state 3 (doubletalk) is through state 4 (hangover),
which is only entered if the hybrid loss is greater than 3 dB and the ERLE of either the state filter or the echo canceller
filter is above the minimum threshold MT.

[0121] State 4, block 338, is a hangover state that accounts for pauses in near-end speech. If the far-end talker is
speaking and near-end speech is not detected for 100 ms in the exemplary embodiment, the NEC transitions from
state 4 (hangover) to state 1 (far-end speech). If the far-end talker is not speaking and near-end speech is not detected
for 50 ms in the exemplary embodiment, the NEC transitions from state 4 (hangover) to state 0 (silence). If near-end
speech is detected, the control algorithm returns the NEC to either state 2 (near-end speech) or state 3 (doubletalk).

[0122] A detailed flow diagram of the NEC state machine control algorithm is shown below in Figure 14. In Figure
14 the algorithm is executed for each sample with a preliminary determination as to whether the current state is state
1 (far-end speech), block 340. If the current state is determined to be state 1 and the value of Hloss is determined to
be less than 3 dB, block 342, then the control element permits an output of the value e(n), block 344. This case is
indicative of the condition where for the previous sample far-end speech was present, but for the current sample
doubletalk is present. Similarly, should the current state be determined to be none of states 1, 2, or 3 (far-end speech,
near-end speech, and doubletalk, respectively) in blocks 340, 346, and 348, the value of e(n) is permitted to be output,
block 344, with output control provided by the state machine. A determination is then made as to the next state the
NEC is to be in for processing the next sample, with the next state determination starting at point E in the control state
machine algorithm.

[0123] Returning to block 340, if the current state is determined to be state 1 (far-end speech), and Hloss is determined
to be greater than 3 dB, block 342, the state filter is permitted to adapt, block 350. The ERLE and ERLE1 are
then checked against VT and if either one is greater than VT, blocks 352 and 354, then the echo canceller filter is
permitted to adapt, block 356. However, should neither the ERLE nor ERLE1 be greater than VT in blocks 352 and 354,
the echo canceller filter is not adapted. In either case a synthesized noise sample is generated, block 358, by the
synthesized noise element under the control of the control element using the LPC coefficients obtained during the last
interval of silence. The synthesized noise sample s(n) is output, block 360, with output control provided by the control
element. A determination is then made as to the next state the NEC is to be in for processing the next sample, with
the next state determination starting at point E.
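The per-sample output and adaptation decisions for state 1 and for the states that fall through to block 344 can be sketched as below. This is a minimal sketch under stated assumptions: the state numbering follows the description, the function name and the string labels for the output are hypothetical, and the case of Hloss exactly equal to 3 dB, which the text leaves open, is treated here like the "less than" branch. The near-end speech and doubletalk states, which also output e(n), are handled by their own branches of the flow diagram and are omitted.

```python
SILENCE, FAR_END, NEAR_END, DOUBLETALK, HANGOVER = range(5)


def far_end_or_default_controls(state, hloss_db, erle_db, erle1_db, vt_db):
    """Return (adapt_state_filter, adapt_ec_filter, output).

    `output` is "e" for the residual e(n) or "noise" for the synthesized
    sample s(n). Covers blocks 340-344 and 350-360 only.
    """
    if state == FAR_END and hloss_db > 3.0:
        # The state filter adapts; the echo canceller filter adapts only if
        # either ERLE measurement exceeds the adaptation threshold VT.
        adapt_ec = erle_db > vt_db or erle1_db > vt_db
        return True, adapt_ec, "noise"
    # State 1 with low hybrid loss (likely doubletalk), or silence/hangover:
    # pass the residual through without adapting.
    return False, False, "e"
```

Replacing the residual with synthesized noise in the far-end state is what keeps residual echo from reaching the mobile regardless of the far-end speech level.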

[0124] At point E the program execution enters a next state subroutine. Should the value of DEM(x) not be greater
than or equal to the integer value of 2, block 362, a check is made to determine if DEM(e) is less than or equal to 1,
block 364. If DEM(e) is not less than or equal to 1 then the state machine transitions to a next state of 2 (near-end
speech), block 366. However, should DEM(e) be less than or equal to 1 then the state machine transitions to a next
state of 0 (silence), block 368. Whether a transition is made to state 2 or 0, the routine proceeds to point F in the state
machine control algorithm for hangover determination.

[0125] However, upon entering the next state subroutine at point E, should the value of DEM(x) be greater than or
equal to 2, block 362, a determination is made as to whether the value of DEM(e) is equal to 3, block 370. If not, the
next state is determined to be 1 (far-end speech), block 372, and the routine proceeds to point F in the control state
machine algorithm for hangover determination. If in block 370 the value of DEM(e) is determined to be equal to 3, then
a check is made to determine if each of Hloss, ERLE, and ERLE1 is less than 3 dB, blocks 374, 376 and 378. If in
blocks 374, 376 and 378 any one of the values is less than 3 dB, the next state is determined to be state 3 (doubletalk),
block 380. However, if in blocks 374, 376 and 378 each value is greater than or equal to 3 dB, the next state is
determined to be state 1 (far-end speech), block 372. From block 380 and block 372, as before, the routine proceeds
to point F in the control state machine algorithm for hangover determination.
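The next state subroutine entered at point E can be summarized in code. The sketch below assumes that DEM(x) and DEM(e) are small integer differential energy levels, as defined earlier in the description, and that Hloss, ERLE and ERLE1 are expressed in dB; the function name is hypothetical.

```python
SILENCE, FAR_END, NEAR_END, DOUBLETALK, HANGOVER = range(5)


def next_state_from_point_e(dem_x, dem_e, hloss_db, erle_db, erle1_db):
    """Next state decision of blocks 362-380."""
    if dem_x < 2:                                   # block 362: no far-end speech
        return NEAR_END if dem_e > 1 else SILENCE   # blocks 364-368
    if dem_e != 3:                                  # block 370
        return FAR_END                              # block 372
    # Far-end speech with a strong residual: doubletalk if any of the
    # three measurements is below 3 dB (blocks 374-380).
    if hloss_db < 3.0 or erle_db < 3.0 or erle1_db < 3.0:
        return DOUBLETALK
    return FAR_END
```

The result is still subject to the hangover determination at point F before it becomes the state used for the next sample.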

[0126] Returning back to block 346, where entry is made to this block if the current state is determined not to be
state 1 (far-end speech) in block 340, the determination is made if the current state is state 2 (near-end speech). If the
current state is state 2 then the value of e(n) is output, block 382. A determination is then made as to the next state
by first determining if DEM(x) is equal to 3, block 384, and if so the next state is set to state 3 (doubletalk), block 386.
However, if DEM(x) is not equal to 3, a determination is made if DEM(e) is greater than or equal to 2, block 388.

[0127] If in block 388 DEM(e) is determined to be greater than or equal to 2, the next state is set to remain as the
current state, state 2 (near-end speech), block 390. However, if in block 388 DEM(e) is determined not to be greater
than or equal to 2, a determination is made whether DEM(x) is less than or equal to 1, block 392. If in block 392 DEM(x)
is determined not to be less than or equal to 1, then the next state is set to be state 3 (doubletalk), block 386. Should
in block 392 DEM(x) be determined to be less than or equal to 1, then the next state is set to be state 4 (hangover),
block 394. Additionally, in block 394 an internal counter, Hcounter (not shown), in the control element is set to an Hcount
value of 400. From blocks 386, 390 and 394 the routine proceeds to point F in the control state machine algorithm for
hangover determination.
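The next state logic for the near-end speech state can be sketched as follows. The names are hypothetical and DEM(x) and DEM(e) are assumed to be small integers as defined earlier in the description. The Hcount of 400 corresponds to the 50 ms hangover at 8 kHz sampling.

```python
SILENCE, FAR_END, NEAR_END, DOUBLETALK, HANGOVER = range(5)
HCOUNT_AFTER_NEAR_END = 400   # 50 ms at 8 kHz


def next_state_from_near_end(dem_x, dem_e):
    """Blocks 384-394. Returns (next_state, new_hcount or None)."""
    if dem_x == 3:                       # block 384: strong far-end speech
        return DOUBLETALK, None          # block 386
    if dem_e >= 2:                       # block 388: near-end speech continues
        return NEAR_END, None            # block 390
    if dem_x > 1:                        # block 392: far-end speech present
        return DOUBLETALK, None          # block 386
    return HANGOVER, HCOUNT_AFTER_NEAR_END   # block 394


print(next_state_from_near_end(0, 0))   # (4, 400): near-end talker paused
```

Returning the counter value together with the state keeps the sketch free of hidden state; an actual implementation would more likely hold Hcount inside the control element.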

[0128] Returning back to block 346, if the result of the determination is that the current state is not state 2 (near-end
speech), a determination is made in block 348 if the current state is state 3 (doubletalk). If the current state is state 3
then the value of e(n) is output, block 396. A determination is then made as to the next state by first determining if
DEM(x) is equal to 3, block 398, and if not the routine proceeds to block 388 for state determination as discussed
above. However, if DEM(x) is equal to 3, a determination is made if Hloss is greater than 3 dB, block 400. If in block
400 Hloss is not greater than 3 dB, the next state is set to state 3 (doubletalk), block 386. Should Hloss be greater
than 3 dB then the state filter is permitted to adapt, block 402.

[0129] Upon permitting the state filter to adapt, a determination is made whether ERLE is greater than MT, block
404, and if not, then a determination is made whether ERLE1 is greater than MT, block 406. If either ERLE or ERLE1
is greater than MT then the next state is set to state 4 (hangover), block 408. However, if ERLE1 is also not greater than
MT then the next state is set to state 3 (doubletalk), block 386. If the next state is set to state 4 in block 408 the Hcount
is set to 800. From blocks 386 and 408 the routine proceeds to point F in the state machine control algorithm for
hangover determination.
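The doubletalk branch can be sketched in the same style. The function names are hypothetical, the threshold MT is passed in as a parameter because its value is not given in this passage, and the shared tail beginning at block 388 is reproduced so that the sketch is self-contained. An Hcount of 800 corresponds to 100 ms at 8 kHz sampling.

```python
SILENCE, FAR_END, NEAR_END, DOUBLETALK, HANGOVER = range(5)


def _from_block_388(dem_x, dem_e):
    """Shared tail, blocks 388-394. Returns (next_state, new_hcount or None)."""
    if dem_e >= 2:
        return NEAR_END, None
    if dem_x > 1:
        return DOUBLETALK, None
    return HANGOVER, 400


def next_state_from_doubletalk(dem_x, dem_e, hloss_db, erle_db, erle1_db, mt_db):
    """Blocks 398-408. Returns (next_state, new_hcount or None, adapt_state_filter)."""
    if dem_x != 3:                                   # block 398
        state, hcount = _from_block_388(dem_x, dem_e)
        return state, hcount, False
    if hloss_db <= 3.0:                              # block 400
        return DOUBLETALK, None, False               # block 386
    # Hybrid loss above 3 dB: the state filter may adapt (block 402).
    if erle_db > mt_db or erle1_db > mt_db:          # blocks 404, 406
        return HANGOVER, 800, True                   # block 408
    return DOUBLETALK, None, True                    # block 386
```

The third return value makes explicit that the state filter is allowed to adapt even when the canceller stays in doubletalk, which is how it can reacquire a changed echo channel.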

[0130] The hangover routine ensures that a delay occurs between the transition from a near-end speech state or a
doubletalk state to a state of far-end speech or silence. Once the hangover determination routine is entered at point
F, a determination is made as to whether the current state is state 4 (hangover), block 410. Should the current state
not be state 4, the state machine control algorithm routine is exited, with the routine returning to point A of Figure 7.

[0131] Should in block 410 the current state be determined to be state 4, a determination is made if the next state
has been set to a state less than state 2, i.e. state 1 (far-end speech) or state 0 (silence), block 412. If the next state
is determined in block 412 not to be state 0 or 1, the state machine control algorithm subroutine is exited, with the
subroutine returning to point A of Figure 7. However, should the next state be determined to be state 0 or 1, the Hcount
is decremented, block 414, with a determination then made if the Hcount is equal to 0, block 416. If the Hcount is
determined to be equal to 0 then the state machine control algorithm subroutine is exited, with the subroutine returning
to point A of Figure 6. However, if the Hcount is not equal to 0 then the next state is set to state 4, block 418, and the
state machine control algorithm subroutine is exited, with the subroutine returning to point A of Figure 7.
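The hangover determination can be sketched as a small function that either accepts the proposed next state or holds the canceller in state 4. The names are hypothetical, and the comparison `hcount <= 0` is a defensive deviation from the "equal to 0" test in the description, so that a counter that was never initialized cannot trap the machine in hangover.

```python
SILENCE, FAR_END, NEAR_END, DOUBLETALK, HANGOVER = range(5)


def hangover_determination(current_state, next_state, hcount):
    """Blocks 410-418. Returns (next_state, hcount) after the hangover check."""
    if current_state != HANGOVER:         # block 410
        return next_state, hcount
    if next_state >= NEAR_END:            # block 412: only states 0 and 1 are delayed
        return next_state, hcount
    hcount -= 1                           # block 414
    if hcount <= 0:                       # block 416 (text tests for equality with 0)
        return next_state, hcount
    return HANGOVER, hcount               # block 418: remain in hangover


# Example: with Hcount = 3, a proposed transition to silence is held for two samples.
state, count = HANGOVER, 3
for _ in range(3):
    state, count = hangover_determination(HANGOVER, SILENCE, count)
    print(state, count)
```

With the Hcount values of 400 and 800 set on entry to state 4, this loop yields the 50 ms and 100 ms hangover intervals at 8 kHz sampling.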

[0132] It should be understood that many of the parameters as discussed with respect to the exemplary embodiment
may be modified within the scope of the teachings of the present invention. For example, the hangover delay may be
changed as may be other parameters, such as threshold values, the number of threshold levels or filter step size
values.

[0133] The previous description of the preferred embodiments is provided to enable any person skilled in the art to
make or use the present invention. The various modifications to these embodiments will be readily apparent to those
skilled in the art, and the generic principles defined herein may be applied to other embodiments without the use of
the inventive faculty. Thus, the present invention is not intended to be limited to the embodiments shown herein but is
to be accorded the widest scope consistent with the principles and novel features disclosed herein.

SUMMARY OF THE INVENTION

[0134]

1. An echo canceller for cancelling in a return channel signal an echoed receive channel signal where the echoed
receive channel signal is combined by an echo channel with an input return channel signal, the echo canceller
comprising:

first filter means for generating first filter coefficients, generating a first echo estimate signal with the first filter
coefficients, and updating the first filter coefficients in response to a first filter control signal;
first summing means for subtracting the first echo estimate signal from a combined return channel and echo
receive channel signal to generate a first echo residual signal;
second filter means for generating second filter coefficients, generating a second echo estimate signal with the
second filter coefficients, and updating the second filter coefficients in response to a second filter control signal;
second summing means for subtracting the second echo estimate signal from the combined signal to generate
a second echo residual signal, and providing upon the return channel the second echo residual signal; and
control means for determining from the receive channel signal, the combined signal, and the first and second
echo residual signals, one of a plurality of control states wherein a first control state is indicative of a receive
channel signal above a first predetermined energy level, wherein when the control means is in the first control
state generating the first control signal and generating the second control signal when at least one of a first
energy ratio of the first echo residual signal and the combined signal and a second energy ratio of the second
echo residual signal and the combined signal exceed a first predetermined energy ratio level.


2. The echo canceller of 1 wherein the control means when in the first control state determines the first predetermined
energy ratio level by determining if the second energy ratio is greater than a sum of a first threshold value
and a first predetermined fixed value, and if so setting the first predetermined energy ratio level to the greater of
the first threshold value and a difference of the second energy ratio and the first predetermined fixed value, and
if the second energy ratio is less than the sum of the first threshold value and the first predetermined fixed value,
and setting the first predetermined energy ratio level to a second predetermined fixed value when the second
energy ratio is less than the difference between the second predetermined fixed value and a third predetermined
fixed value.

3. The echo canceller of 1 wherein the control means further determines a second control state in the plurality of
control states, the second control state indicative of the input return channel signal above a second predetermined
energy level, and when the control means is in the second control state inhibiting the generation of both the first
and second control signals.

4. The echo canceller of 1 wherein the control means further determines a second control state in the plurality of
control states, the second control state indicative of the receive channel signal above the first predetermined energy
level and the input return channel signal is above a second predetermined energy level, and when the control
means is in the second control state generating the first control signal.

5. The echo canceller of 4 wherein the control means when in the second control state generates the first control
signal when a ratio of the receive channel signal energy and combined signal is greater than a third predetermined
energy ratio level.

6. The echo canceller of 1 further comprising output means for generating a noise signal, providing the noise signal
in replacement of the second echo residual signal upon the return channel in response to a noise select signal,
wherein the control means when in the first control state further generates the noise select signal.

7. The echo canceller of 6 wherein the control means when in the first control state generates the noise select
signal when a ratio of the receive channel signal energy and combined signal is greater than a third predetermined
energy ratio level.

8. The echo canceller of 7 wherein the control means further determines a second control state in said plurality of
control states, the second control state indicative of both the receive channel signal and the input return channel
signal respectively below second and third predetermined energy levels, and when the control means is in the
second control state inhibiting the generation of the first and second control signals and wherein the output means
comprises:

noise analysis means for, when the control means is in the second control state, performing a linear predictive
coding analysis of the second echo residual signal and providing an analysis output;
noise synthesis means for receiving the analysis output and synthesizing the noise signal representative of
the second echo residual signal; and
switch means for providing an output of the second echo residual signal upon the return channel and responsive
to the noise select signal for providing the noise signal upon the return channel in replacement of the
second echo residual signal.



Claims 

1. A method of controlling echo cancellation, comprising the steps of:

• filtering a far-end speech signal with a plurality of adaptive filters;

• determining if one or both of a near-end speech signal and a far-end speech signal are present based on
a result of said step of filtering;

• selecting one cancellation state from a plurality of predetermined states, based on said step of determining;

• performing echo cancellation on said far-end speech signal to generate an echo cancellation signal
based on said step of selecting; and

• outputting said echo cancellation signal.

2. The method of claim 1 wherein said step of performing echo cancellation comprises the steps of:

• filtering said far-end speech signal using a first adaptive filter to generate an echo estimate signal;

• subtracting said echo estimate signal from a combined echo and near-end speech signal to generate a residual
signal; and

• generating said echo cancellation signal based on said residual signal and said cancellation state.

3. The method of claim 2 wherein said step of determining comprises the steps of:

• filtering said far-end speech signal using a second adaptive filter to generate a state signal; and

• determining if one or both of said near-end and far-end speech signals are present in accordance with said
state signal and said echo estimate signal.

4. The method of claim 3 wherein said step of selecting selects a far-end speech state indicative of the presence of
only a far-end speech signal, further comprising the steps of:

• measuring the energy of said far-end speech signal;

• measuring the energy of said combined echo and near-end speech signal;

• forming a ratio of said far-end speech signal energy and said combined signal energy;

• comparing said ratio with a first predetermined threshold value;

• updating coefficients of said second adaptive filter if said ratio is greater than said first predetermined threshold
value; and

• suspending update of said second adaptive filter if said ratio is less than or equal to said first predetermined
value.

5. The method of claim 4 wherein said step of selecting selects said far-end speech state, further comprising the
steps of:

• determining a first echo return loss produced by said first adaptive filter;

• determining a second echo return loss produced by said second adaptive filter;

• comparing said first echo return loss with a second predetermined threshold value;

• comparing said second echo return loss with a third predetermined threshold value; and

• updating coefficients of said first adaptive filter if said first echo return loss exceeds said second predetermined
threshold value or if said second echo return loss exceeds said third predetermined threshold value.

6. The method of claim 5 wherein said step of selecting selects said far-end speech state and said step of performing 
generates a synthesized background noise signal as said echo cancellation signal. 

7. The method of claim 3 wherein said step of selecting selects a near-end speech state indicative of the presence 
of only said near-end speech signal and said step of performing generates said residual signal as said echo can- 
cellation signal. 

8. The method of claim 3 wherein said step of selecting selects a doubletalk state indicative of the presence of both 
said near-end speech signal and said far-end speech signal and said step of performing generates said residual 
signal as said echo cancellation signal. 

9. The method of claim 1 further comprising the steps of: 

• after outputting said echo cancellation signal, again determining if one or both of a near-end speech signal 
and a far-end speech signal are present; and 

• selecting an updated cancellation state from said plurality of predetermined states based on said step of again 
determining. 

10. The method of claim 9 wherein said step of again determining the presence of said far-end speech signal comprises 
the steps of: 

• measuring the energy of a far-end signal; 

• measuring the energy of a background noise signal of said far-end signal; 

• determining at least one far-end threshold value in accordance with said far-end background noise energy 
value; 

• comparing said far-end signal energy with said far-end background noise signal energy to determine a far- 
end differential energy value; and 

• determining whether said far-end speech signal is being provided to said echo canceller in accordance with
said far-end differential energy value and said at least one far-end threshold value.

11. The method of claim 10 wherein said step of again determining the presence of said near-end speech signal 
comprises the steps of: 

• generating an echo estimate signal using said second adaptive filter; 

• generating a residual signal by subtracting said echo estimate signal from a combined echo channel and near- 
end speech signal; 

• measuring the energy of said residual signal; 

• measuring the energy of a background noise signal of said residual signal; 

• determining at least one near-end threshold value in accordance with said residual background noise energy
value;

• comparing said residual signal energy with said residual background noise signal energy to determine a
residual differential energy value; and

• determining whether said near-end speech signal is being provided to the echo canceller in accordance with
said residual differential energy value and said at least one near-end threshold value.

12. The method of claim 11 wherein when said cancellation state is a near-end speech state indicative of the presence
of only said near-end speech signal and said far-end differential energy value is equal to a far-end speech indication
value, said step of selecting an updated cancellation state selects a doubletalk state indicative of the presence of
both said near-end speech signal and said far-end speech signal;

when said cancellation state is said near-end speech state, said far-end differential energy value is not equal
to said far-end speech indication value, and said residual differential energy value is greater than or equal to
a near-end transition value, said step of selecting an updated cancellation state again selects said near-end
speech state; and

when said cancellation state is said near-end speech state, said far-end differential energy value is less than
said far-end speech indication value, and said residual differential energy value is less than said near-end
transition value, said step of selecting an updated cancellation state selects a hangover state indicative of a
pause in said near-end speech signal.

13. The method of claim 11 wherein when said cancellation state Is a far-end speech state indicative of the presence 
of only a far-end speech signal, said far-end dfff erential energy value is greater than or equal to a far-end transition 
value, and said residual differential energy value is less than a near-end transition value, said step of selecting an 
updated cancellation state again selects said far-end speech state; 

25 

• when said cancellation state is said far-end speech state, said far-end differential energy value is less than 
said far-end transition value, and said residual differential energy value is less than said near-end transition 
value, said step of selecting an updated cancellation state selects a silent state indicative of the absence of 
said near-end speech signal and said far-end speech signal; and

• when said cancellation state is said far-end speech state, said far-end differential energy value is less than
said far-end transition value, and said residual differential energy value is greater than or equal to said near-end
transition value, said step of selecting an updated cancellation state selects a near-end speech state
indicative of the presence of only a near-end speech signal. 
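Claim 13 amounts to a lookup on two threshold comparisons. The sketch below expresses it as a small decision table; the function name, parameter names, and state labels are hypothetical. The combination in which both comparisons succeed is not recited in claim 13, so the sketch returns `None` for it rather than guessing.

```python
def next_from_far_end(far_dem, res_dem, far_transition, near_transition):
    """Select the updated state while in the far-end speech state (claim 13)."""
    far_active = far_dem >= far_transition
    near_active = res_dem >= near_transition
    table = {
        (True, False): "far_end",    # far-end speech continues alone
        (False, False): "silent",    # neither signal is present
        (False, True): "near_end",   # only near-end speech is present
    }
    # (True, True) is not recited in claim 13; None marks it as unspecified.
    return table.get((far_active, near_active))
```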

14. The method of claim 11 wherein said cancellation state is a far-end speech state indicative of the presence of only
a far-end speech signal, said far-end differential energy value is greater than or equal to said far-end transition 
value, and said residual differential energy value is equal to said near-end speech indication value, further com- 
prising the steps of: 

determining a first echo return loss produced by said first adaptive filter and a second echo return loss produced 
by said second adaptive filter; 

comparing said first echo return loss with a fourth predetermined threshold value and comparing said second 
echo return loss with a fifth predetermined threshold value; and 

wherein, when said first echo return loss is less than said fourth predetermined threshold value and said 
second echo return loss is less than said fifth predetermined threshold value, said step of selecting an updated 
cancellation state selects a doubletalk state indicative of the presence of both said near-end speech signal 
and said far-end speech signal; and when said first echo return loss is greater than or equal to said fourth 
predetermined threshold value and said second echo return loss is greater than or equal to said fifth prede- 
termined threshold value, said step of selecting an updated cancellation state again selects said far-end speech 
state. 
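The echo return loss comparison of claim 14 might be sketched as follows. The claim does not say how the echo return loss is measured; the helper `echo_return_loss_db` assumes the conventional ratio, in decibels, of signal energy before and after the echo estimate is subtracted, and assumes both energies are positive. All names are hypothetical, and the mixed case (one loss below its threshold, the other not) is left unspecified because the claim does not recite it.

```python
import math


def echo_return_loss_db(combined_energy, residual_energy):
    # Assumption: loss is the energy ratio before/after cancellation, in dB.
    return 10.0 * math.log10(combined_energy / residual_energy)


def far_end_loss_check(loss1, loss2, threshold4, threshold5):
    """Decide between doubletalk and far-end speech as recited in claim 14."""
    if loss1 < threshold4 and loss2 < threshold5:
        # Both filters cancel poorly: near-end speech is likely also present.
        return "doubletalk"
    if loss1 >= threshold4 and loss2 >= threshold5:
        # Both filters still cancel well: far-end speech alone.
        return "far_end"
    # Mixed results are not addressed by the claim.
    return None
```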

15. The method of claim 11, wherein when said cancellation state is a doubletalk state indicative of the presence of
both said near-end speech signal and said far-end speech signal, said far-end differential energy value is less 
than a far-end speech indication value, and said residual differential energy value is greater than or equal to a 
near-end transition value, said step of selecting an updated cancellation state selects a near-end speech state 
indicative of the presence of only said near-end speech signal; and 

• when said cancellation state is said doubletalk state, said far-end differential energy value is less than a far-end
transition value, and said residual differential energy value is less than said near-end transition value,
said step of selecting an updated cancellation state selects a hangover state indicative of a pause in said near- 
end speech signal. 

16. The method of claim 11, wherein said cancellation state is a doubletalk state indicative of the presence of both a
near-end speech signal and a far-end speech signal, and said far-end differential energy value is equal to a far- 
end speech indication value, further comprising the steps of: 

• updating the coefficients of said second adaptive filter; 

• determining a first echo return loss produced by said first adaptive filter and a second echo return loss produced
by said second adaptive filter;

• comparing said first echo return loss with a fourth predetermined threshold value and comparing said second 
echo return loss with a fifth predetermined threshold value; and

• wherein when said first echo return loss is greater than said fourth predetermined threshold value or said
second echo return loss is greater than said fifth predetermined threshold value, said step of selecting an
updated cancellation state selects a hangover state indicative of a pause in said near-end speech signal; and
when said first echo return loss is less than or equal to said fourth predetermined threshold value and said 
second echo return loss is less than or equal to said fifth predetermined threshold value, said step of selecting 
an updated cancellation state again selects said doubletalk state. 

17. A method for controlling echo cancellation in an echo canceller, said echo canceller having one cancellation state 
of a predetermined plurality of states based on the presence of near-end and far-end speech signals, said echo 
canceller for generating a residual signal for output based on said cancellation state, said residual signal generated 
by filtering a far-end speech signal with a first adaptive filter to form an echo estimate signal and subtracting said 
echo estimate signal from a combined echo and near-end speech signal to form said residual signal, said echo 
canceller also having a second adaptive filter for filtering said far-end speech signal to generate a state signal for 
determining the presence of said near-end and far-end speech signals, comprising the steps of: 

• from a silent state indicative of the absence of said near-end speech signal and said far-end speech signal,
transitioning into a far-end speech state when said far-end speech signal is present, and while in said far-end 
speech state indicative of the presence of only said far-end speech signal, updating coefficients of said first 
adaptive filter and synthesizing a noise signal for output in place of said residual signal; 

• from said far-end speech state, transitioning into said silent state when said far-end speech signal is no longer 
present; 

• from said silent state, transitioning into a near-end speech state when said near-end speech signal is present, 
and while in said near-end speech state indicative of the presence of only said near-end speech signal, sus- 
pending update of said first and second adaptive filters and generating said residual signal for output; and 

• from said silent state, transitioning into a doubletalk state when said near-end and far-end speech signals are
both present, and while in said doubletalk state indicative of the presence of both said near-end and far-end
speech signals, updating coefficients of said second adaptive filter and generating said residual signal for
output.
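The per-state behavior and the transitions out of the silent state recited in claim 17 can be summarized in a short sketch. The table and function names are hypothetical, as are the state labels. Where the claim is silent on whether a filter is updated in a given state, the entry is `None` rather than a guess. Testing for both signals first is one reading of the claim, not a stated ordering.

```python
# state -> (update first filter, update second filter, signal sent to output)
# None means claim 17 does not say whether that filter is updated.
ACTIONS = {
    "far_end": (True, None, "synthesized_noise"),
    "near_end": (False, False, "residual"),
    "doubletalk": (None, True, "residual"),
}


def next_from_silent(near_present, far_present):
    """Transitions out of the silent state as recited in claim 17."""
    if near_present and far_present:
        return "doubletalk"
    if far_present:
        return "far_end"
    if near_present:
        return "near_end"
    # Neither signal present: remain silent.
    return "silent"
```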

18. The method of claim 17 further comprising the steps of:

• from said far-end speech state, transitioning into said near-end speech state when said near-end speech 
signal is present and said far-end speech signal is no longer present, and while in said near-end speech state, 
suspending update of said first and second adaptive filters and generating said residual signal for output; and 

• from said near-end speech state, transitioning into said doubletalk state when said near-end and far-end
speech signals are both present and while in said doubletalk state, updating coefficients of said second adap- 
tive filter and generating said residual signal for output. 

19. The method of claim 18 further comprising the steps of:

• from said near-end speech state, transitioning into said doubletalk state when said far-end speech signal is 
present and said near-end speech signal is still present and while in said doubletalk state, updating coefficients 
of said second adaptive filter and generating said residual signal for output; 

• from said doubletalk state, transitioning into said near-end speech state when said near-end speech signal is 
still present and said far-end speech signal is no longer present, and while in said near-end speech state,
suspending update of said first and second adaptive filters and generating said residual signal for output;

• from said near-end speech state, transitioning into a hangover state when said near-end speech signal is no 
longer present, and while in said hangover state indicative of a pause in said near-end speech signal, sus- 
pending update of said first and second adaptive filters and generating said residual signal for output; and 

• from said hangover state, transitioning into said near-end speech state when said near-end speech signal is 
again present within a predetermined period of time and said far-end speech is not present, and while in said 
near-end speech state, suspending update of said first and second adaptive filters and generating said residual 
signal for output.

20. The method of claim 19 further comprising the steps of: 

• from a hangover state indicative of a pause in said near-end speech signal, transitioning into said silent state 
when said far-end speech signal is not present and said near-end speech signal is not present after a prede- 
termined period of time; and 

• from said hangover state, transitioning into said far-end speech state when said far-end speech signal is still 
present and said near-end speech signal is not present after a predetermined period of time, and while in said 
far-end speech state, updating coefficients of said first adaptive filter and synthesizing a noise signal for output 
in place of said residual signal. 
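The hangover transitions of claims 19 and 20 depend on a timer as well as on signal presence. A minimal sketch, assuming the "predetermined period of time" is tracked as an elapsed count compared against a limit (both hypothetical parameters, as are the function name and state labels); combinations that these two claims do not recite are assumed to hold the hangover state.

```python
def next_from_hangover(near_present, far_present, elapsed, limit):
    """Transitions out of the hangover state per claims 19 and 20.

    elapsed / limit model the predetermined period; their units
    (for example frames) are an assumption of this sketch.
    """
    if elapsed < limit:
        if near_present and not far_present:
            # Near-end speech resumed within the period.
            return "near_end"
        return "hangover"
    if not near_present:
        # Period expired with no near-end speech.
        return "far_end" if far_present else "silent"
    # Assumption: cases not recited in claims 19 and 20 hold the state.
    return "hangover"
```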

21. The method of claim 18 further comprising the steps of:

• from said doubletalk state, transitioning into a hangover state when said near-end speech signal is no longer
present and said far-end speech signal is still present, and while in said hangover state indicative of a pause
in said near-end speech signal, continuing to suspend update of said first and second adaptive filters and
generating said residual signal for output; and 

• from said hangover state, transitioning into said doubletalk state when said near-end speech signal is again
present within a predetermined period of time and said far-end speech signal is still present, and while in said
doubletalk state, updating coefficients of said second adaptive filter and generating said residual signal for
output.

22. The method of claim 21 further comprising the steps of: 

• from said hangover state, transitioning into said silent state when said far-end speech signal is not present 
and said near-end speech signal is not present after a predetermined period of time; and 

• from said hangover state, transitioning into said far-end speech state when said far-end speech signal is still
present and said near-end speech signal is not present after a predetermined period of time, and while in said 
far-end speech state, updating coefficients of said first adaptive filter and synthesizing a noise signal for output 
in place of said residual signal. 

23. The method of claim 18 further comprising the steps of: 

• from said doubletalk state, transitioning into said hangover state when said near-end speech signal is no longer
present and said far-end speech signal is still present, and while in said hangover state, continuing to suspend
update of said first and second adaptive filters and generating said residual signal for output; and

• from said hangover state, transitioning into said doubletalk state when said near-end speech signal is again
present within a predetermined period of time and said far-end speech signal is still present, and while in said
doubletalk state, updating coefficients of said second adaptive filter and generating said residual signal for
output. 

24. The method of claim 23 further comprising the steps of: 

• from said hangover state, transitioning into said silent state when said far-end speech signal is not present 
and said near-end speech signal is not present after a predetermined period of time; and 

• from said hangover state, transitioning into said far-end speech state when said far-end speech signal is still 
present and said near-end speech signal is not present after a predetermined period of time, and while in said 
far-end speech state, updating coefficients of said first adaptive filter and synthesizing a noise signal for output 
in place of said residual signal. 



25. The method of claim 18 further comprising the steps of: 

• from a hangover state indicative of a pause in said near-end speech signal, transitioning into said silent state 
when said far-end speech signal is not present and said near-end speech signal is not present after a prede-
termined period of time; and

• from said hangover state, transitioning into said far-end speech state when said far-end speech signal is still
present and said near-end speech signal is not present after a predetermined period of time, and while in said 
far-end speech state, updating coefficients of said first adaptive filter and synthesizing a noise signal for output 
in place of said residual signal. 

26. The method of claim 17 further comprising the steps of: 

• from said near-end speech state, transitioning into said doubletalk state when said far-end speech signal is
present and said near-end speech signal is still present, and while in said doubletalk state, updating coefficients
of said second adaptive filter and generating said residual signal for output;

• from said doubletalk state, transitioning into said near-end speech state when said near-end speech signal is 
still present and said far-end speech signal is no longer present, and while in said near-end speech state,
suspending update of said first and second adaptive filters and generating said residual signal for output; 

• from said near-end speech state, transitioning into a hangover state when said near-end speech signal is no
longer present, and while in said hangover state indicative of a pause in said near-end speech signal, sus-
pending update of said first and second adaptive filters and generating said residual signal for output; and

• from said hangover state, transitioning into said near-end speech state when said near-end speech signal is 
again present within a predetermined period of time and said far-end speech is not present, and while in said
near-end speech state, suspending update of said first and second adaptive filters and generating said residual
signal for output.

27. The method of claim 26 further comprising the steps of: 

• from said doubletalk state, transitioning into said hangover state when said near-end speech signal is no longer
present and said far-end speech signal is still present, and while in said hangover state, continuing to suspend
update of said first and second adaptive filters and generating said residual signal for output; and

• from said hangover state, transitioning into said doubletalk state when said near-end speech signal is again 
present within a predetermined period of time and said far-end speech signal is still present, and while in said 
doubletalk state, updating coefficients of said second adaptive filter and generating said residual signal for
output.

28. The method of claim 27 further comprising the steps of: 

• from said hangover state, transitioning into said silent state when said far-end speech signal is not present 
and said near-end speech signal is not present after a predetermined period of time; and

• from said hangover state, transitioning into said far-end speech state when said far-end speech signal is still 
present and said near-end speech signal is not present after a predetermined period of time, and while in said 
far-end speech state, updating coefficients of said first adaptive filter and synthesizing a noise signal for output 
in place of said residual signal. 

29. The method of claim 26 further comprising the steps of: 

• from said hangover state, transitioning into said silent state when said far-end speech signal is not present
and said near-end speech signal is not present after a predetermined period of time; and 

• from said hangover state, transitioning into said far-end speech state when said far-end speech signal is still
present and said near-end speech signal is not present after a predetermined period of time, and while in said
far-end speech state, updating coefficients of said first adaptive filter and synthesizing a noise signal for output 
in place of said residual signal. 

30. The method of claim 17 further comprising the steps of:

• from said doubletalk state, transitioning into a hangover state when said near-end speech signal is no longer
present and said far-end speech signal is still present, and while in said hangover state indicative of a pause
in said near-end speech signal, continuing to suspend update of said first and second adaptive filters and
generating said residual signal for output; and 

• from said hangover state, transitioning into said doubletalk state when said near-end speech signal is again
present within a predetermined period of time and said far-end speech signal is still present, and while in said
doubletalk state, updating coefficients of said second adaptive filter and generating said residual signal for
output.

31. The method of claim 30 further comprising the steps of:

• from said hangover state, transitioning into said silent state when said far-end speech signal is not present 
and said near-end speech signal is not present after a predetermined period of time; and 

• from said hangover state, transitioning into said far-end speech state when said far-end speech signal is still 
present and said near-end speech signal is not present after a predetermined period of time, and while in said
far-end speech state, updating coefficients of said first adaptive filter and synthesizing a noise signal for output
in place of said residual signal. 

32. The method of claim 17 further comprising the steps of: 

• from a hangover state indicative of a pause in said near-end speech signal, transitioning into said silent state
when said far-end speech signal is not present and said near-end speech signal is not present after a prede- 
termined period of time; and 

• from said hangover state, transitioning into said far-end speech state when said far-end speech signal is still
present and said near-end speech signal is not present after a predetermined period of time, and while in said 
far-end speech state, updating coefficients of said first adaptive filter and synthesizing a noise signal for output 
in place of said residual signal. 